Abstract
I rigorously analyze a proposal, introduced by D.R. Terno, about a spatial localization observable for a Klein–Gordon massive real particle in terms of a Poincaré-covariant family of POVMs. I prove that these POVMs are actually a kinematic deformation of the Newton–Wigner PVMs. The first moment of one of these POVMs, however, exactly coincides with a restriction (on a core) of the Newton–Wigner self-adjoint position operator, though the second moment does not. This fact makes it possible to preserve all the good properties of the Newton–Wigner position observable while dropping the unphysical features arising from the Hegerfeldt theorem. The considered POVM does not permit spatially sharply localized states, but it admits families of states that are almost localized with arbitrary precision. Next, I establish that the Terno localization observable satisfies part of a requirement introduced by D.P.L. Castrigiano about causal temporal evolution concerning the Lebesgue measurable spatial regions of any Minkowskian reference frame. The validity of Castrigiano’s complete causality requirement is also proved for a notion of spatial localization which generalizes Terno’s in a natural way.
1 Introduction
A long-standing puzzling issue of theoretical and mathematical physics concerns the notion of spatial localization of a relativistic particle at a given time. The problem is difficult because of a number of no-go results that have emerged over the years, after the seminal work of Newton and Wigner [31]. These theoretical snags establish that apparently natural proposals to define a spatial observable of a relativistic free particle (for a given Minkowski reference frame at a certain time) are actually forbidden by general requirements concerning causal locality and positivity of the energy. The first victim of these no-go results is the very Newton–Wigner localization notion.
My opinion is that this issue has been quite overlooked in spite of being urgent: after all, experimental physicists can assert, with a certain approximation, where a relativistic particle has been detected at a certain time in laboratories. What theoretical notion describes these kinds of claims by our colleagues?
The notion of position observable is not only perfectly defined in the non-relativistic regime, but it also plays a central role in the theoretical construction of the very corpus of quantum theory. The notion of position is involved in the first version of the canonical commutation relations and in the theoretical explanation of the Heisenberg principle. How is it possible that such a crucial theoretical notion simply fades out when we pass to the relativistic regime?
The situation is very delicate from the physical perspective. First of all, we know that trying to localize a particle below its Compton length gives rise to a pair of particles, so that a sharp localization seems not possible. In this sense the detectors should play an active role [1]. However, that is a physical fact which is predicted by interacting QFT. It is not clear how such an obstruction should take place in an elementary (perhaps naive) mathematical description that disregards the effects of Quantum Field Theory.
In the author’s view, however, the intricate nature of the problem is also due to a frequent confusion in the literature concerning two entangled, but actually physically distinct, issues.
(I1) On the one hand, one can focus on the properties and theoretical assumptions on the probability of spatial localization, without paying attention to the post-measurement state. In that case, the major obstructions against apparently natural definitions of localization observables arise from a class of theoretical results collectively called Hegerfeldt’s theorem [20, 21] and their more advanced re-formulations [8, 9]. They at least prove that no sharp localization is possible if the generator of time evolution is bounded below. Sharp localization would imply non-local features of the (time evolution of the) position probability distributions: a superluminal spread of the probability distribution [34]. These no-go results concern any general description of the spatial localization observable (at a given time) in terms of positive-operator valued measures (POVMs) and not only projection-valued measures (PVMs). Castrigiano [8] formulated a precise causality condition ((b) in Definition 15) that every physically acceptable POVM (or PVM) describing spatial localization should satisfy independently of the issue of the post-measurement state.
(I2) On the other hand, one can (also) focus on the post-measurement state arising after a position measurement. In that case, a list of no-go results has been accumulated over the years starting from the so-called Malament theorem [27]. In particular, it establishes that localization cannot be described in terms of PVMs, i.e., not even in terms of self-adjoint operators. This happens when (a) the post-measurement state is produced by a projective measurement, (b) the PVM satisfies natural requirements of locality (according to Hellwig–Kraus’ analysis [22]), and (c) the generator of time evolution is positive or bounded below. By strengthening the hypotheses of Malament’s statement, the no-go result can be extended to localization observables in terms of POVMs, as established first by Busch [5] and later by Halvorson and Clifton [19], when a suitable post-measurement procedure has been chosen (essentially an ideal Lüders measurement).
However, there is no automatic way to pass from (I1) to (I2), especially when the position observable is described in terms of a POVM. There are infinitely many measurement schemes (based on completely positive maps) which give rise to the same PVM or POVM while the post-measurement states are completely different. This arbitrariness was already noticed by von Neumann in his seminal book on the mathematical foundations of Quantum Mechanics and it is a fundamental tool in the modern theory of quantum measurement [6]. The fact that the values of a position observable are continuous is a further source of problems. Continuity of outcomes rules out all naive state-updating procedures on account of the crucial Ozawa theorem [32]. The standard projective Lüders scheme is physically untenable in this case, even if it is always described as the prototype of all state updating processes in many textbooks of quantum mechanics.
Referring to (I2), it seems to me that these no-go results against every notion of spatial localization observable always rely on a precise choice of the description of the post-measurement state in terms of Kraus operators (e.g., they are the square root of the effects of the POVM). In my opinion, this choice appears to oscillate between being too naive or too arbitrary. Therefore, some apparently definite claims, relying upon the issue (I2), about the non-existence of any spatial localization observable [19] do not seem really motivated up to now. Even if they impose some severe constraint on the measurement scheme, the last word has not been said in my view.
Both issues (I1) and (I2) rule out, in particular, the already cited Newton–Wigner position observable [31] of a quantum relativistic particle.
The Newton–Wigner position observable is described in terms of a family of PVMs \({{\textsf{Q}}}_{n,t}={{\textsf{Q}}}_{n,t}(\Delta )\), where \(\Delta \) ranges in the measurable sets of the rest 3-space \(\Sigma _{n,t}\) of every given Minkowski reference frame n at every given time t. This family of PVMs is covariant with respect to the Poincaré group. It is worth stressing that covariance with respect to the spatial Euclidean subgroup (together with some further technical hypotheses) uniquely determines the family of \({{\textsf{Q}}}_{n,t}\) as a consequence of Mackey’s imprimitivity theory, as proved by Wightman [38]. This is one of the theoretical motivations which make the NW position observable quite appealing.
In view of the spectral theorem, the information of the family of PVMs \({{\textsf{Q}}}_{n,t}\) is completely encapsulated in the assignment of a set of self-adjoint operators, the Newton–Wigner position operators
where \(x^0=t\) and the Minkowski coordinates (self-adjoint operators) \(N_{n,t}^1,N_{n,t}^2,N_{n,t}^3\) of a particle are co-moving with n. Obviously \(N^0_{n,t}=tI\).
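The relation between the PVMs and the position operators invoked here can be sketched as follows. This is only the generic spectral-integral form suggested by the spectral theorem, written under the assumption that \(x^\mu \) denotes the Minkowski coordinates co-moving with n restricted to \(\Sigma _{n,t}\); operator domains are not spelled out.

```latex
% Sketch: assumed standard spectral-integral form (domains omitted).
N^\mu_{n,t} := \int_{\Sigma_{n,t}} x^\mu \, \mathrm{d}{\textsf{Q}}_{n,t}(x)\,,
\qquad \mu = 0,1,2,3\,,
% x^0 = t is constant on \Sigma_{n,t} and {\textsf{Q}}_{n,t}(\Sigma_{n,t}) = I, hence
\qquad
N^0_{n,t} = t \int_{\Sigma_{n,t}} \mathrm{d}{\textsf{Q}}_{n,t}(x) = t\, I\,.
```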
To make the issue more intricate, the Newton–Wigner position self-adjoint operator \(N^\mu _{n,t}\) possesses quite natural and appealing properties in spite of the fact that the associated PVM violates basic local-causality principles. In particular (see Sect. 3), explicitly referring to the case of a scalar massive particle:
(i) natural covariance properties with respect to the Lorentz (and Poincaré) group take place (on a suitable domain):
$$\begin{aligned} U_\Lambda N_{n,t}^\alpha U_\Lambda ^{-1} = (\Lambda ^{-1})^\alpha _\beta N^{\beta }_{\Lambda n, t_{\Lambda }}\,, \quad \forall \Lambda \in O(1,3)_+\,; \end{aligned}$$
(ii) a quite natural relativistic version of Ehrenfest’s theorem is valid for \(k=1,2,3\):
$$\begin{aligned} U^{(n)\dagger }_t N_{n,0}^k U^{(n)}_t = N_{n,t}^k = N_{n,0}^k + t\frac{P_{nk}}{P_{n0}}\,; \end{aligned}$$
(iii) the worldline determined by the expectation values \(\langle \psi |N^\mu _{n,t} \psi \rangle \) is timelike, as expected for a massive particle:
$$\begin{aligned} \sum _{k=1}^3 \left( \frac{\textrm{d}}{\textrm{d}t} \langle \psi | N^{k}_{n,t} \psi \rangle \right) ^2 < 1 \,; \end{aligned}$$
(iv) Heisenberg’s commutation relations are satisfied on a suitable dense invariant domain (a core):
$$\begin{aligned} {[}N_{n,t}^k, N_{n,t}^h]=[ P_{n h}, P_{n k}]=0\,, \qquad [ N_{n,t}^k, P_{n h}]= i\hbar \delta ^k_hI \,; \end{aligned}$$
(v) this, in particular, produces the standard statement of the Heisenberg principle;
(vi) when the energy content of a state vector \(\psi \) is small compared with the rest energy \(mc^2\) of the particle, \(N^k_{n,0}\psi \) tends to become \(X^k\psi \), where \(X^k\) is the non-relativistic position operator.
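Property (iii) can be read off from (ii) by an elementary estimate. The following is a sketch for a unit vector \(\psi \) in a domain where the formal manipulations are justified (an assumption here), using only the mass-shell relation \(P_{n0}^2 = m^2 + \sum _k P_{nk}^2\) and the Cauchy–Schwarz inequality. Only squares of \(P_{n0}\) enter, so the estimate does not depend on the sign convention for \(P_{n0}\).

```latex
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d}t}\langle \psi | N^k_{n,t}\psi\rangle
  &= \Big\langle \psi \Big| \frac{P_{nk}}{P_{n0}}\psi\Big\rangle
  && \text{by (ii)},\\
\sum_{k=1}^3 \Big\langle \psi \Big| \frac{P_{nk}}{P_{n0}}\psi\Big\rangle^2
  &\le \sum_{k=1}^3 \Big\langle \psi \Big| \frac{P_{nk}^2}{P_{n0}^2}\psi\Big\rangle
  && \text{(Cauchy--Schwarz, } \Vert\psi\Vert = 1\text{)}\\
  &= \Big\langle \psi \Big| \frac{P_{n0}^2 - m^2}{P_{n0}^2}\psi\Big\rangle
   = 1 - m^2 \Big\langle \psi \Big| \frac{1}{P_{n0}^2}\psi\Big\rangle < 1
  && (m>0)\,.
\end{aligned}
```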
This paper is devoted to addressing the issue (I1) for a scalar Klein–Gordon particle with mass \(m>0\). To this end, a recent proposal of (non-commutative) POVM localization observable \({{\textsf{A}}}_{n,t}(\Delta )\) will be considered for massive spin-0 particles. This proposal is due to Terno [36]. This notion of localization, contrary to the Newton–Wigner one, does not admit sharply localized states (Proposition 25), so that it is not in automatic conflict with the Hegerfeldt theorem. However, it admits states which resemble localized states with arbitrarily fine approximation (Proposition 25). A sketch of a proof that the spatial decay of the Terno probabilities does not trigger Hegerfeldt’s superluminal phenomena appears in [36]. We shall rigorously prove this fact as a byproduct of the achievement (B) below.
We shall show (Theorem 22) that the POVM \({{\textsf{A}}}_{n,t}(\Delta )\) is actually a kinematic deformation of the PVM \({{\textsf{Q}}}_{n,t}(\Delta )\) in terms of the components of the four-momentum \(P_n^\mu \) in the used Minkowski reference frame n:
This relation implies, in particular, that the family of POVMs \({{\textsf{A}}}_{n,t}\) satisfies a covariance property with respect to the Poincaré group analogous to the one satisfied by \({{\textsf{Q}}}_{n,t}\).
Three main results are next achieved in this paper by expanding and making mathematically rigorous some definitions and results discussed in [36] and referring to some ideas introduced in [8].
(A) Theorem 26 proves that, in spite of the difference between the two POVMs, the first-moment operator \(X^\alpha _{n,t}\) of Terno’s POVM coincides with the Newton–Wigner position operator. Therefore, \(X^\alpha _{n,t}\) preserves all the good properties (i)–(vi) of that operator listed above, except (v). In fact, a corrected version of the Heisenberg inequality will be established:
$$\begin{aligned} \Delta _\psi X^k_{n,t} \Delta _\psi P_{nk} \ge \frac{\hbar }{2} \sqrt{1 + 2\Delta _\psi P_{nk}^2 \left\langle \psi \left| \frac{(P_{n0})^2-(P_{nk})^2}{(P_{n0})^{4}}\psi \right. \right\rangle }\,. \end{aligned}$$
It evidently reproduces the standard inequality for large values of the mass.
(B) Theorem 35 proves that Terno’s notion of spatial localization satisfies a consequence of the causality requirement introduced by Castrigiano [8], as conjectured by Terno [36]. The validity of this condition rules out, in particular, the obstruction represented by Hegerfeldt’s theorem. This pair of achievements promotes \({{\textsf{A}}}_{n,t}(\Delta )\) to a very good candidate for the relativistic notion of spatial localization of a massive scalar particle, at least from the viewpoint of the issue (I1).
(C) The validity of the complete Castrigiano causality requirement is finally established (Theorem 39). However, this result needs an improved version of the family of POVMs \({{\textsf{A}}}\) and a delicate discussion about the physical nature of spatial localization.
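The statement in (A) that the corrected inequality reduces to the standard one for large mass can be made quantitative by a rough bound. The following sketch assumes \(c=1\), a unit vector \(\psi \), and the mass-shell relation \((P_{n0})^2 = m^2 + \sum _h (P_{nh})^2\), with the inequalities understood in the sense of multiplication operators; it only estimates the expectation value appearing under the square root.

```latex
\begin{aligned}
% (P_{n0})^2 - (P_{nk})^2 = m^2 + \sum_{h \neq k} (P_{nh})^2 \geq m^2 > 0
0 < \frac{(P_{n0})^2 - (P_{nk})^2}{(P_{n0})^4}
  \le \frac{1}{(P_{n0})^2} \le \frac{1}{m^2}
\quad\Longrightarrow\quad
0 < \Big\langle \psi \Big| \frac{(P_{n0})^2-(P_{nk})^2}{(P_{n0})^{4}}\psi \Big\rangle
  \le \frac{1}{m^2}\,.
\end{aligned}
```

Hence, at fixed momentum spread, the correction under the square root is of order \(1/m^2\) and disappears as \(m\rightarrow +\infty \).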
In recent years, several interesting problems related to the issue (I2) and local causality have been fruitfully addressed in the setting of algebraic quantum field theory by Fewster, Verch and collaborators [4, 13, 14] in a given curved (globally hyperbolic) spacetime. These papers complete and largely extend the fundamental analysis by Hellwig and Kraus [22]. In that case, the relevant notion of localization refers to spacetime regions and to generic local observables in the Haag–Kastler setting. This paper instead deals with single particles (not quantum fields) and the localization refers to the space of a reference frame at a given time in Minkowski spacetime. It is clear that this is an ideal description which will perhaps eventually prove unphysical, since realistic measurements necessarily take a finite lapse of time. However, up to now, this type of ideality does not seem to be a source of the above-mentioned obstructions to the definition of a physically meaningful notion of spatial localization. On the other hand, it seems remarkable that the Terno notion of spatial localization is actually a byproduct of QFT, at least from a heuristic perspective: it arises from the normally-ordered stress-energy tensor operator, whose nature is intrinsically part of the basic constructions of QFT.
This paper is organized as follows. Section 2 contains a quick technical recap on the massive Klein–Gordon field in Minkowski spacetime, stressing, in particular, the covariance properties with respect to the relevant Poincaré unitary representation. Section 3 introduces the Newton–Wigner notion of spatial localization according to Wightman’s viewpoint. Section 4 illustrates some well-known problems with the NW notion of localization, also presenting Castrigiano’s general causality requirement and the notion of causal time evolution, and proving that the NW notion of localization is ruled out by the Hegerfeldt theorem. Section 5 puts the notion of spatial localization presented by Terno into a rigorous setting and establishes some of its important properties. Section 6 proves that this notion of spatial localization is in agreement with Castrigiano’s notion of causal time evolution. Section 7 focuses on the causality condition proposed by Castrigiano by introducing a second family of POVMs depending on a pair of reference frames. The final section is devoted to a discussion of the achieved results and possible developments.
2 Minkowski spacetime and Klein–Gordon massive particles
2.1 Minkowski spacetime
In the rest of the paper, the Minkowski spacetime \({{\mathbb {M}}}\) is described as a four-dimensional real affine space (whose vector space of translations is denoted by \({\textsf{V}}\)) endowed with a Lorentzian metric g in \({\textsf{V}}\) with signature \(-,+,+,+\). A basis \(\{v_0,v_1, v_2,v_3\}\subset {\textsf{V}}\) is said to be pseudo-orthonormal if \(g(v_\mu ,v_\nu ) = \eta _{\mu \nu }\), where \([\eta _{\mu \nu }]= diag(-1,1,1,1)\).
Causal vectors \(v\in {\textsf{V}}\) satisfy by definition \(g(v,v) \le 0\) and \(v\ne 0\). Causal vectors with \(g(v,v)=0\) are said to be null or lightlike. They are timelike if \(g(v,v) <0\). Finally, spacelike vectors satisfy \(g(v,v)>0\).
\(({{\mathbb {M}}},g)\) is time-oriented, i.e., we choose a preferred half \({\textsf{V}}_+\) of the open cone of the timelike vectors, \(g(v,v)<0\). The (causal!) vectors in \(\overline{{\textsf{V}}_+}\setminus \{0\}\) are said to be future-directed. \({{\textsf{T}}}_+:=\{ v\in {\textsf{V}}_+ \,|\, g(v,v) = -1\}\) is the set of unit future-directed timelike vectors. The remaining half of the open cone of timelike vectors is denoted by \({\textsf{V}}_-\). The past-directed causal vectors are the elements of \(\overline{{\textsf{V}}_-}\setminus \{0\}\). The past-directed timelike and lightlike vectors are analogously the elements of \({\textsf{V}}_-\) and \(\partial {\textsf{V}}_-\setminus \{0\}\), respectively.
\(J^+(S) \subset {{\mathbb {M}}}\) denotes the causal future of \(S\subset {{\mathbb {M}}}\). It is the set of events \(e\in {{\mathbb {M}}}\) such that there is some \(e'\in S\) such that \(e-e' \in \overline{{\textsf{V}}_+}\). An analogous definition is valid for the causal past \(J^-(S)\) of S. Notice that \(S\subset J^{\pm }(S)\), \(A\subset B\) implies \(J^\pm (A) \subset J^\pm (B)\), and \(J^\pm \left( \bigcup _{\alpha \in A}S_\alpha \right) = \bigcup _{\alpha \in A}J^\pm (S_\alpha )\).
Remark 1
Throughout \(v\cdot u:= g(v,u)\) when \(u,v\in {\textsf{V}}\). The light speed is \(c=1\) and the Planck constant satisfies \(\hbar = 1\) unless I specify otherwise. \(\blacksquare \)
2.2 Poincaré group, reference frames, and all that
I adopt the conventions of [15] regarding the interpretation of the relevant groups of transformations in \({{\mathbb {M}}}\). The orthochronous Lorentz group \(O(1,3)_+\) is the group of linear maps \(\Lambda : {\textsf{V}}\rightarrow {\textsf{V}}\) which both preserve the metric g and \({\textsf{V}}_+\). The orthochronous Poincaré group \(IO(1,3)_+\) is the group of affine maps \({{\mathbb {M}}}\rightarrow {{\mathbb {M}}}\) whose associated linear map belongs to \(O(1,3)_+\).
If \(A\subset {{\mathbb {M}}}\) and \(h\in IO(1,3)_+\), then \(hA:= \{h(e)\,|\, e \in A\}.\)
Every \(n\in {{\textsf{T}}}_+\) defines a corresponding (Minkowskian) reference frame in \({{\mathbb {M}}}\). The three-dimensional rest spaces of the reference frame n are the three-planes (pseudo-ortho) normal to n. To label them, one chooses a preferred point \(o \in {{\mathbb {M}}}\) called origin. (Nothing discussed in this paper depends on this choice.) A rest space of \(n \in {{\textsf{T}}}_+\) is therefore denoted by \(\Sigma _{n,t}\), where \(t\in {{\mathbb {R}}}\) indicates the signed distance (the proper time of n) of \(\Sigma _{n,t}\) from o:
With a choice of the origin \(o\in {{\mathbb {M}}}\), the orthochronous Poincaré group \(IO(1,3)_+\) is isomorphic to the semidirect product of \(O(1,3)_+\) and \({\textsf{V}}\) itself and acts as follows
By construction, if \(h:=(\Lambda _h,a_h) \in IO(1,3)_+\),
Notice that \(t_{h} = t- a_h \cdot \Lambda _h n\) does not depend on the choice of \(e\in \Sigma _{n,t}\).
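The transformation rule of the rest spaces can be verified directly. The sketch below assumes, consistently with the signature and with the stated value of \(t_h\), that \(\Sigma _{n,t} = \{e \in {{\mathbb {M}}} \,|\, -n\cdot (e-o) = t\}\) and that \(h=(\Lambda _h,a_h)\) acts as \(h(e) = o + \Lambda _h(e-o) + a_h\); both formulas are assumptions of this sketch.

```latex
\begin{aligned}
-(\Lambda_h n)\cdot \big(h(e) - o\big)
  &= -(\Lambda_h n)\cdot \Lambda_h (e-o) - (\Lambda_h n)\cdot a_h \\
  &= -n\cdot (e-o) - a_h\cdot \Lambda_h n
  && (\Lambda_h \text{ preserves } g)\\
  &= t - a_h\cdot \Lambda_h n =: t_h
  && (e\in \Sigma_{n,t})\,.
\end{aligned}
```

Thus \(h\Sigma _{n,t} \subset \Sigma _{\Lambda _h n, t_h}\) with \(t_h\) independent of e, and applying the same argument to \(h^{-1}\) gives the equality of the two sets.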
The Euclidean group \({{{\mathcal {E}}}}_n\) of \(\Sigma _{n,t}\), i.e., the group of \(h_{n,t}\)-isometries, coincides with the subgroup of \(IO(1,3)_+\) of elements \((\Lambda ,a)\) which preserve n:
With the choice of an origin o, \({{\mathbb {M}}}\) is identified with \({\textsf{V}}\) by means of the bijective map \({{\mathbb {M}}}\ni e \mapsto e-o \in {\textsf{V}}\). The choice of a basis \(\{v_1,\ldots , v_4\}\subset {\textsf{V}}\) defines a (global) Cartesian coordinate system of origin o given by \({{\mathbb {M}}}\ni e \mapsto (x^1(e),\ldots , x^4(e)) \in {{\mathbb {R}}}^4\) where \(e= o+ \sum _{\alpha =1}^4 x^\alpha (e)v_\alpha \). That system of Cartesian coordinates is said to be Minkowskian if the basis is pseudo-orthonormal. A Minkowskian coordinate system, with coordinates \(x^0=t,x^1,x^2,x^3\), is co-moving with \(n\in {{\textsf{T}}}_+\) if \(\frac{\partial }{\partial x^0}=n\). Evidently \(x^1,x^2,x^3\) define (global) Cartesian orthonormal coordinates on each \(\Sigma _{n,t}\) referring to the Euclidean metric \(h_{n,t}\) induced on it by g.
\({\mathscr {B}}(\Sigma _{n,t})\) will denote the family of Borel subsets on \(\Sigma _{n,t}\). Independently of the choice of the coordinates, \(h_{n,t}\) induces a positive regular Borel measure \(\textrm{d}\Sigma _{n,t}\) on \(\Sigma _{n,t}\). In the above coordinates \(x^1,x^2,x^3\), that measure is the restriction \(d^3x=\textrm{d}x^1\textrm{d}x^2\textrm{d}x^3\) of the Lebesgue measure on \({{\mathbb {R}}}^3\) to the Borel sets. The completion of \(d^3x\) is the Lebesgue measure itself as a consequence. The corresponding completion of \(\textrm{d}\Sigma _{n,t}\) will be named Lebesgue measure on \(\Sigma _{n,t}\). I will make use of the same symbol \(\textrm{d}\Sigma _{n,t}\) for a measure and its completion as the difference will be clear from the choice of the used \(\sigma \)-algebra. The Lebesgue \(\sigma \)-algebra on \(\Sigma _{n,t}\) will be denoted by \({\mathscr {L}}(\Sigma _{n,t})\).
2.3 Completion of measures and \(L^2\) spaces
A positive \(\sigma \)-additive measure \(\mu : \Sigma (X) \rightarrow [0,+\infty ]\) and its completion \({\overline{\mu }}: \overline{\Sigma (X)} \rightarrow [0,+\infty ]\) give rise to the same Hilbert space \(L^2(X, \mu )\) since (see, e.g., Proposition 1.57 [28]), for every \(\overline{\Sigma (X)} \)-measurable function f, there is a \(\Sigma (X)\)-measurable function g such that \(f=g\) is true \(\mu \)-almost everywhere and either \(\int _X f \textrm{d}{\overline{\mu }}= \int _X g \textrm{d}\mu \) or both the integrals do not exist. The identity evidently extends to \(L^2\)-scalar products of pairs of corresponding functions. The map \(L^2(\mu ) \ni [f]_\mu \mapsto [f]_{{\overline{\mu }}} \in L^2({\overline{\mu }})\) is a Hilbert space isomorphism.
2.4 Hilbert space and Poincaré group representation for the massive Klein–Gordon particle
In the rest of this work, I will take advantage of the Einstein convention of summation over repeated Greek indices, from 0 to 3.
Let us consider a Klein–Gordon real particle of mass \(m>0\) described by the \(C^\infty \) scalar field \(\varphi : {{\mathbb {M}}}\rightarrow {{\mathbb {R}}}\) satisfying the normally hyperbolic Klein–Gordon equation
$$\begin{aligned} \Box \varphi - m^2 \varphi =0\,. \end{aligned}$$
As is well-known, the quantization of that system, viewed as the restriction to the one-particle space of the second quantization procedure, relies on the Hilbert space of pure state vectors
Above, if \({\textsf{V}}_{m,+}:= \{p\in {\textsf{V}}\,|\, g(p,p) = -m^2\,, \,p\in {\textsf{V}}_+\} \) denotes the mass shell of (positive energy) four-momenta of mass m, the Hilbert space inner product reads
Above, \(\mu _m(p)\) is the Lorentz-invariant (positive Borel regular) measure which takes the form
in every Minkowskian reference frame co-moving with \(n\in {{\textsf{T}}}_+\), \(d^3p= \textrm{d}p^1\textrm{d}p^2\textrm{d}p^3\) being the standard Lebesgue measure on \({{\mathbb {R}}}^3\) identified with the rest spaces of n by means of any Minkowskian coordinate system co-moving with n (that measure is independent of the chosen Minkowskian coordinate frame co-moving with n). Notice that
are, respectively, the n-temporal component and the n-spatial component of the four-momentum p, corresponding to \(p^0\) and the triple \((p^1,p^2,p^3)\) in any Minkowski coordinate system co-moving with n. As \(E_n(p)\) depends only on \(\vec {p}_n\), I will occasionally write \(E_n(\vec {p}_n)\) in place of \(E_n(p)\).
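For orientation, the decomposition of a four-momentum on the mass shell relative to n can be sketched as follows. The sketch assumes the signature \((-,+,+,+)\) fixed above and writes \(|\vec {p}_n|^2 := g(\vec {p}_n,\vec {p}_n)\), a notation introduced only here.

```latex
\begin{aligned}
E_n(p) &:= -n\cdot p\,, \qquad \vec{p}_n := p - E_n(p)\, n\,, \qquad
n\cdot \vec{p}_n = n\cdot p + E_n(p) = 0\,,\\
-m^2 &= g(p,p) = -E_n(p)^2 + |\vec{p}_n|^2
\quad\Longrightarrow\quad
E_n(p) = \sqrt{m^2 + |\vec{p}_n|^2} \;\ge\; m\,,
\end{aligned}
```

where the positive root is selected because \(p\in {\textsf{V}}_+\) and \(n \in {{\textsf{T}}}_+\) imply \(n\cdot p <0\).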
As usual, the (normal pure) quantum states of the particle are represented by the unit vectors \(\psi \in {{{\mathcal {H}}}}\) up to phases.
The inner product (5) is invariant under the strongly-continuous unitary (active) action, induced by (2), of the orthochronous Poincaré group \(IO(1,3)_+\):
This invariance property arises from the \(O(1,3)_+\) invariance of \(\mu _m\):
The action of the time-translation subgroup along the time direction \(n\in {{\textsf{T}}}_+\) reads
so that the self-adjoint generator of the one-parameter group, the multiplicative operator
has positive spectrum since \(\sigma (H_n) = \sigma _c(H_n) = [m,+\infty )\). In this formalism, the time-evolution operator in n is
\(H_n\) is the Hamiltonian operator in the reference frame \(n\in {{\textsf{T}}}_+\). The self-adjoint generators of the spatial translations
in n along the spatial unit vectors \(v_k\) of a co-moving Minkowskian coordinate system are therefore the multiplicative operators
Evidently, \(\sigma (P_{nk})= \sigma _c(P_{nk})= {{\mathbb {R}}}\) for \(k=1,2,3\).
The operators \((P_{n0}, P_{n1}, P_{n2}, P_{n3})\) define the (covariant) components of the four-momentum in n with respect to the relevant Minkowskian coordinate system co-moving with n. No specification of time t is necessary because \(P_{n \alpha }\) is trivially a constant of motion.
Definition 2
We say that \(\psi \in {{\mathcal {H}}}\) is of Schwartz type if there is \(n\in {{\textsf{T}}}_+\) and a Minkowski coordinate system co-moving with n such that \({{\mathbb {R}}}^3 \ni \vec {p} \mapsto \psi (E_n(p), \vec {p}_n)\in {{\mathbb {C}}}\) stays in \({{\mathscr {S}}}({{\mathbb {R}}}^3)\) (the Schwartz space on \({{\mathbb {R}}}^3\)) when represented in the spatial coordinates on \({{\mathbb {R}}}^3\). The \({{\mathcal {H}}}\) subspace of vectors of Schwartz type will be denoted by \(\mathcal{S}({{{\mathcal {H}}}})\).
Proposition 3
The definition of \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) does not depend on the choice of n and co-moving Minkowskian coordinates. That is equivalent to saying that \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is invariant under the representation U of \(IO(1,3)_+\) in (7). Finally, \(\mathcal{S}({{{\mathcal {H}}}})\) is dense in \({{{\mathcal {H}}}}\).
Proof
See “Appendix A” \(\square \)
Proposition 4
\({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is invariant under the components of the four-momentum \(P_{n\alpha }\), \(\alpha =0,1,2,3\), referred to a reference frame \(n\in {{\textsf{T}}}_+\). Furthermore, \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is a core for each of those symmetric operators (i.e., each of them is essentially self-adjoint thereon).
Proof
See “Appendix A” \(\square \)
If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the associated covariant wavefunction (the name is justified by (14) below) is
where \(x(e) = e-o\in {\textsf{V}}\) is the vector representation of the events in \({{\mathbb {M}}}\) with respect to the origin o.
Proposition 5
If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the associated wavefunction \(\varphi _\psi \) satisfies the following.
(1) \(\varphi _\psi \in C^\infty ({{\mathbb {M}}}; {{\mathbb {C}}})\) and \(\varphi _\psi (t,\cdot ) \in {\mathscr {S}}({{\mathbb {R}}}^3)\) for every \(t\in {{\mathbb {R}}}\), where \({{\mathbb {R}}}^3 \equiv \Sigma _{n,t}\) through the choice of a Minkowskian coordinate system co-moving with any chosen \(n\in {{\textsf{T}}}_+\).
(2) The Klein–Gordon equation is valid, \(\Box \varphi _\psi - m^2 \varphi _\psi =0\,.\)
(3) If also \(\psi ' \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), then
$$\begin{aligned} \langle \psi |\psi ' \rangle = \frac{i}{2}\int _{\Sigma _{n,t}} \left( \overline{\varphi _\psi } \partial _n \varphi _{\psi '} - \overline{\varphi _{\psi '}} \partial _n \varphi _\psi \right) \textrm{d}\Sigma _{n,t} \end{aligned}\tag{13}$$
where the right-hand side does not depend on the choice of both \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\) since the left-hand side does not.
(4) The action (7) of \(IO(1,3)_+\) induces the standard active action on scalar fields in \({{\mathbb {M}}}\),
$$\begin{aligned} \varphi _{U_{(\Lambda , a)}\psi }(x) = \varphi _\psi \left( \Lambda ^{-1}(x-a)\right) \,. \end{aligned}\tag{14}$$
Finally, \({{{\mathcal {H}}}}\) coincides with the completion of \(\mathcal{S}({{{\mathcal {H}}}})\) equipped with the inner product provided by the right-hand side of (13).
I leave the proof of these very well-known facts to the reader. They are based on elementary results of the theory of Fourier(-Plancherel) transform. The last statement immediately arises from (13) and the last statement of Proposition 3.
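As an illustration of item (2), the Klein–Gordon equation follows from the mass-shell condition once the covariant wavefunction is written as a superposition of plane waves. The sketch assumes the form \(\varphi _\psi (x) = C\int _{{\textsf{V}}_{m,+}} e^{i p\cdot x}\psi (p)\,\textrm{d}\mu _m(p)\), with a normalization constant C left unspecified, and \(\Box = \eta ^{\mu \nu }\partial _\mu \partial _\nu \); the Schwartz property of \(\psi \) is what justifies differentiating under the integral sign.

```latex
\begin{aligned}
\Box\, e^{i p\cdot x}
  &= \eta^{\mu\nu}(i p_\mu)(i p_\nu)\, e^{i p\cdot x}
   = -g(p,p)\, e^{i p\cdot x} = m^2 e^{i p\cdot x}
  && (p\in {\textsf{V}}_{m,+})\,,\\
\Box \varphi_\psi(x) - m^2\varphi_\psi(x)
  &= C\int_{{\textsf{V}}_{m,+}} \big(\Box - m^2\big) e^{i p\cdot x}\, \psi(p)\,
     \mathrm{d}\mu_m(p) = 0\,.
\end{aligned}
```

The sign chosen in the exponent plays no role in this computation.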
3 The Newton–Wigner observable for the massive Klein–Gordon particle
3.1 The Newton–Wigner PVM
I assume that the reader is well acquainted with basic notions of spectral theory and the notion of Projection Valued Measure (PVM) (see, e.g., [28, 29]).
Consider a (separable) Hilbert space \({{{\mathcal {H}}}}\) that defines the pure states of a quantum particle, not necessarily Klein–Gordon nor relativistic, but possibly equipped with spin and other internal observables. According to Wightman [38],
Definition 6
A Newton–Wigner PVM [31, 38] for a particle described in the (complex, separable) Hilbert space \({{{\mathcal {H}}}}\) is defined as a PVM \({{\textsf{P}}}: {\mathscr {B}}({{\mathbb {R}}}^3) \rightarrow {{\mathfrak {B}}}({{{\mathcal {H}}}})\)—where \({\mathscr {B}}({{\mathbb {R}}}^3) \) is the Borel \(\sigma \)-algebra of \({{\mathbb {R}}}^3\)—which is covariant with respect to a strongly continuous unitary representation V of the group of isometries \({{{\mathcal {E}}}}\) of \({{\mathbb {R}}}^3\) in \({{{\mathcal {H}}}}\):
\({{\mathbb {R}}}^3\) is above interpreted as the joint spectrum of three Newton–Wigner position self-adjoint operators
Remark 7
Wightman, on account of Mackey’s imprimitivity systems theory, established the uniqueness of a Newton–Wigner position observable of a given unitary and strongly-continuous representation V of the Euclidean group \({{{\mathcal {E}}}}\) under suitable regularity requirements on V and invariance under time-reversal symmetry. A more recent discussion appears in [8]. For a technically extensive discussion concerning relativistic systems with every value of the square mass (also understood as an operator) and the spin, see [8, 9]. \(\blacksquare \)
According to the general interpretation of the formalism, the physical interpretation of a Newton–Wigner PVM is that \(\langle \psi | {{\textsf{P}}}(\Delta )\psi \rangle \) is the probability to find the particle in the region \(\Delta \subset {{\mathbb {R}}}^3\) when the pure state is represented by \(\psi \in {{{\mathcal {H}}}}\).
In the case of the real scalar Klein–Gordon particle, a Newton–Wigner PVM \({{\textsf{Q}}}_{n,t}\) is constructed as follows on the rest 3-space \(\Sigma _{n,t}\) of a reference frame \(n\in {{\textsf{T}}}_+\). Here, the restriction V of \(U: IO(1,3)_+ \rightarrow {{\mathfrak {B}}}({{{\mathcal {H}}}})\) (7) to the Euclidean subgroup \({{{\mathcal {E}}}}_n\) (4) is used to implement Wightman’s definition. As before, events \(e\in {{\mathbb {M}}}\) are identified with vectors through \(x(e) = e-o \in {\textsf{V}}\).
If \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), and \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), define
Above \(\Delta \in {\mathscr {B}}(\Sigma _{n,t})\) and \(\textrm{d}\Sigma _{n,t}(x)= d^3x\) in Minkowskian coordinates co-moving with n. As the mathematical tools appearing in the formula are coordinate independent for a choice of \(n\in {{\textsf{T}}}_+\), the operator on the left-hand side only depends on (n, t). The resulting family of operators defines a Newton–Wigner observable on every slice \(\Sigma _{n,t}\) according to Wightman’s definition because of the following result.
Proposition 8
Each operator of the \((n,t, \Delta )\)-parametrized family (17) defined on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and taking values in \({{\mathcal {H}}}\), uniquely extends by continuity to the whole space \(\mathcal H\). The found family of operators, for \(t\in {{\mathbb {R}}}\) fixed, defines a PVM on \({\mathscr {B}}(\Sigma _{n,t})\) satisfying the covariance requirement (15) with respect to the group of isometries \({{{\mathcal {E}}}}_n\) (4) of \(\Sigma _{n,t}\).
If indicating the found orthogonal projectors with the same symbol \({{\textsf{Q}}}_{n,t}(\Delta )\), the action of \(IO(1,3)_+\) on them reads
Proof
Fix a Minkowskian coordinate system co-moving with n. Define the unitary map
where \(\vec {p}_n\equiv (p_1,p_2,p_3) \in {{\mathbb {R}}}^3\) according to the said choice of a Minkowskian coordinate system. Notice that, as \(m>0\), the written map restricts to a bijection from \({{{\mathcal {S}}}}(\mathcal{H})\), which is dense in \({{{\mathcal {H}}}}=L^2({\textsf{V}}_{m,+}, \mu _m)\), onto \({\mathscr {S}}({{\mathbb {R}}}^3)\) viewed as dense subspace of \(L^2({{\mathbb {R}}}^3, d^3p)\). We then have that, for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), (17) can be reformulated as
Above, \(1_\Delta \) is the multiplicative operator with the characteristic function of \(\Delta \subset {{\mathbb {R}}}^3 \equiv \Sigma _{n,t}\) (\(1_\Delta ({x})=1\) if \({x}\in \Delta \) and \(1_\Delta ({x})=0\) otherwise); \({{\mathcal {F}}}: L^2(\Sigma _{n,t}, \textrm{d}\Sigma _{n,t}) \rightarrow L^2({{\mathbb {R}}}^3, d^3p)\) is the Fourier–Plancherel unitary transform (after having identified \(\Sigma _{n,t}\) with \({{\mathbb {R}}}^3\) and \(\textrm{d}\Sigma _{n,t}\) with the Lebesgue measure \(d^3x\) with the same choice of a Minkowskian coordinate system as above). \({{\mathcal {F}}}\) and its inverse preserve the Schwartz space. The map \({\mathscr {B}}({{\mathbb {R}}}^3) \ni \Delta \mapsto 1_\Delta \in {{\mathfrak {B}}}(L^2({{\mathbb {R}}}^3,d^3x))\) is evidently a PVM in the written Hilbert space. As \({{\mathcal {F}}}^{-1} S_n\) is norm preserving, and when restricted to the dense subspace of Schwartz functions has a dense range, \( S_n^{-1} {{\mathcal {F}}}\, 1_\Delta \, {{\mathcal {F}}}^{-1} S_n|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})}\) extends to a bounded, everywhere-defined operator, and the family of these extensions, as \(\Delta \) varies, is also a PVM. Identity (15) is an immediate consequence of (18) when \({{{\mathcal {E}}}}_3\) is identified with \({{{\mathcal {E}}}}_n\) (4). Let us prove (18). From (20), for \(\psi ,\psi '\in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the Fubini and Tonelli theorems yield
where \(-n\cdot x= t\) and the integrals are interpreted in the proper sense. Let us define
At this juncture, taking advantage of (8) and observing that the \({{\mathbb {M}}}\)-isometry invariance of the measures induced by the metric \(\textrm{d}\Sigma _{n,t}(x) = dh\Sigma _{n,t}(hx) \) entails, for \(h\in IO(1,3)_+\)
Using the found identity in (21) and taking (7) into account leads to
Since \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is dense in \({{{\mathcal {H}}}}\) and the operators are bounded and everywhere defined, the found identity extends to the general case \(\psi ,\psi '\in {{{\mathcal {H}}}}\) ending the proof. \(\square \)
Definition 9
The family \(\{{{\textsf{Q}}}_{n,t}(\Delta )\}_{\Delta \in {\mathscr {B}}(\Sigma _{n,t})}\) constructed in Proposition 8 is the Newton–Wigner PVM of the massive Klein–Gordon particle in the reference frame n at time t. The collection \({{\textsf{Q}}}\) of all these PVMs when \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\) is the Newton–Wigner spatial localization observable.
Remark 10
-
(1)
In view of the \(IO(1,3)_+\) covariance and (10)
$$\begin{aligned} {{\textsf{Q}}}_{n,t}(\Delta +t) = U_t^{(n)\dagger } {{\textsf{Q}}}_{n,0}(\Delta ) U_t^{(n)}\,,\quad \forall t\in {{\mathbb {R}}}\,\,, \forall \Delta \in {\mathscr {B}}(\Sigma _{n,0})\,. \end{aligned}$$(22)
In other words, the Newton–Wigner PVM at time t in n is the Heisenberg evolution of the one at time zero according to the time evolutor in the reference frame n.
-
(2)
The non-relativistic limit for a state \(\psi \in \mathcal{H}\), in a reference frame \(n\in {{\textsf{T}}}_+\), can be viewed as the requirement that \(|\psi (p)|\) vanishes outside a region where \(|\vec {p}_n|\) is much smaller than m, so that the energy is sharply concentrated around m. It is easy to see from (12) that, in this situation, \(m\varphi _\psi \) tends to become a standard Schrödinger wavefunction for a free particle of mass m. The use of the same type of states in (21) shows that \(\langle \psi | {{\textsf{Q}}}_{n,0}(\Delta )\psi \rangle \) tends to the probability of finding the particle in \(\Delta \) (at \(t=0\)) according to the standard non-relativistic position PVM on the said state \(\psi \).
-
(3)
There is, however, another regime where the Newton–Wigner PVM approximates the PVM of the classical position observable. It is when \(\psi \) is sharply narrowed around a value of the momentum \(p_0\). In that case, similarly to before, \(E(p_0)\varphi _\psi \) tends to become a standard Schrödinger wavefunction for a free particle of mass m and \(\langle \psi | {{\textsf{Q}}}_{n,0}(\Delta )\psi \rangle \) tends to the probability of finding the particle in \(\Delta \) (at \(t=0\)) according to the standard non-relativistic position PVM. \(\blacksquare \)
3.2 NW localization does not mean localized covariant wavefunctions: antilocality
I am in a position to illustrate an annoying fact which sharply distinguishes the relativistic and the non-relativistic theory. Newton–Wigner localization in a bounded set \(\Delta \subset \Sigma _{n,t}\) for a state \(\psi \) implies that the associated wavefunction \(\varphi _\psi \) is essentially supported also outside \(\Delta \) itself at time t.
Choose a reference frame n and a co-moving Minkowskian coordinate system \(t=x^0,x^1,x^2,x^3\) and write \(\vec {x}:= (x^1,x^2,x^3)\). Looking at (20), if \(\psi \in {{{\mathcal {H}}}}\),
Notice that \(\Psi _t \in {\mathscr {S}}({{\mathbb {R}}}^3)\) if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) where \({{\mathbb {R}}}^3\) identifies with \(\Sigma _{n,t}\). On account of (20), the action of \({{\textsf{Q}}}_{n,t}(\Delta )\) on \(\Psi \) is trivially the multiplication with \(1_{\Delta }(\vec {x})\). On the other hand, the definition of covariant wavefunction associated to a state (12) can be re-formulated in terms of \(\Psi \):
This definition is valid for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) as the original version (12) is. However, as indicated, it can be trivially extended to the general case \(\psi \in {{{\mathcal {H}}}}\), since the self-adjoint operator \((\overline{- \Delta + m^2})^{-1/4}\) is bounded and everywhere defined in \(L^2({{\mathbb {R}}}^3, d^3x)\). In that case, the covariant wavefunction satisfies (Footnote 3) \(\varphi _\psi (t,\cdot ) \in L^2({{\mathbb {R}}}^3,d^3x)\). A crucial property known as antilocality [30, 35] of \((\overline{- \Delta + m^2})^{\alpha }\) plays a fundamental role in the rest of the paper.
Theorem 11
Let \(k \in {{\mathbb {N}}}\), \(m>0\), and suppose that \({{\mathbb {R}}}\ni \alpha \not \in {{\mathbb {Z}}}\). If both \(\Psi \in L^2({{\mathbb {R}}}^k, d^kx)\) and \((\overline{- \Delta + m^2})^{\alpha } \Psi \) vanish a.e. with respect to \(d^kx\) in an open non-empty set \(\Omega \subset {{\mathbb {R}}}^k\)—assuming \(\Psi \in D((\overline{- \Delta + m^2})^{\alpha })\) for \(\alpha >0\)—then \(\Psi =0\) in \( L^2({{\mathbb {R}}}^k, d^kx)\).
This theorem, together with Eq. (24), allows one to prove a well-known annoying fact regarding spatial localization according to NW: localized states do not correspond to localized covariant wavefunctions (item (2) below).
Proposition 12
Let us consider the Newton–Wigner localization observable \({{\textsf{Q}}}\) of a massive Klein–Gordon particle. The following facts are true for given \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\).
-
(1)
\({{\textsf{Q}}}_{n,t}(\Delta )=0\) if and only if \(\Delta \) has zero measure with respect to \(\textrm{d}\Sigma _{n,t}\).
-
(2)
Let \(\psi \in {{{\mathcal {H}}}}\setminus \{0\}\) be localized in a spatial region \(\Delta \in {\mathscr {B}}(\Sigma _{n,t})\), i.e.,
$$\begin{aligned} {{\textsf{Q}}}_{n,t}(\Delta )\psi = \psi \,. \end{aligned}$$
If \(\Delta \) is not dense (in particular, if \(\Delta \) is bounded) then \(\varphi _\psi (t, \cdot )\) cannot vanish a.e. in any non-empty open subset of \(\Sigma _{n,t}\setminus \Delta \).
Proof
(1) is obvious since, under unitary equivalence, \({{\textsf{Q}}}_{n,t}(\Delta )\) is the multiplicative operator \(1_{\Delta }\). Let us pass to (2). Suppose that \(\Delta \) is not dense and consider an open non-empty set \(\Omega \subset \Sigma _{n,t}\setminus \Delta \). \({{\textsf{Q}}}_{n,t}(\Delta )\psi = \psi \) is equivalent to \(1_{\Delta }\Psi _t = \Psi _t\) a.e. with respect to \(\textrm{d}\Sigma _{n,t}\), in particular, \(\Psi _t(\vec {x})=0\) a.e. in \(\Omega \). If also \(\varphi _\psi (t, \vec {x})= (\overline{- \Delta + m^2})^{-1/4}\Psi _t(\vec {x})=0\) a.e. for \(\vec {x} \in \Omega \), Theorem 11 applied to \(\Psi = \Psi _t\) for \(\alpha =-1/4\) would imply \( \Psi _t=0\), namely \(\psi =0\). This is impossible because \(||\psi || \ne 0\). \(\square \)
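The mechanism behind item (2) can be illustrated numerically. What follows is only a minimal sketch under simplifying assumptions that are not part of the argument above: one spatial dimension instead of three, a periodic grid, and an FFT discretization of \((\overline{- \Delta + m^2})^{-1/4}\). The function name `fractional_power`, the smooth bump used as \(\Psi _t\), and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def fractional_power(psi, dx, m, alpha):
    """Apply (-d^2/dx^2 + m^2)^alpha to samples psi on a periodic grid (FFT sketch)."""
    k = 2.0 * np.pi * np.fft.fftfreq(psi.size, d=dx)  # discrete wave numbers
    return np.fft.ifft((k**2 + m**2) ** alpha * np.fft.fft(psi))

# Hypothetical parameters: mass m = 1, box [-20, 20), 2048 grid points.
m, half_width, n_points = 1.0, 20.0, 2048
x = np.linspace(-half_width, half_width, n_points, endpoint=False)
dx = x[1] - x[0]

# Psi plays the role of Psi_t: a smooth bump supported in Delta = (-1, 1),
# so that 1_Delta * Psi = Psi (NW localization in Delta).
Psi = np.zeros_like(x)
inside = np.abs(x) < 1.0
Psi[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))

# phi plays the role of the covariant wavefunction (alpha = -1/4).
phi = fractional_power(Psi, dx, m, -0.25)

# Omega = (2, 4) is an open set disjoint from Delta.
omega = (x > 2.0) & (x < 4.0)
max_Psi_in_omega = np.max(np.abs(Psi[omega]))  # Psi vanishes identically on Omega
max_phi_in_omega = np.max(np.abs(phi[omega]))  # phi does not
print(max_Psi_in_omega, max_phi_in_omega)
```

In this toy model the discretized \(\Psi \) vanishes on \(\Omega \) while the discretized \(\varphi \) does not, in line with Theorem 11 for \(\alpha =-1/4\); the numerical experiment is of course no substitute for the proof.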
3.3 The Newton–Wigner position self-adjoint operator
I now define the Newton–Wigner position self-adjoint operators. Given a reference frame \(n\in {{\textsf{T}}}_+\), choose a co-moving Minkowskian coordinate system \(t:=x^0,x^1,x^2,x^3\). Following [31, 38], I define the Newton–Wigner position self-adjoint operators in n associated to this coordinate system,
where the integration is the standard one according to a PVM (see, e.g., [29]).
Proposition 13
The Newton–Wigner position self-adjoint operators (25) satisfy the following.
-
(1)
\(\sigma ( N_{n,t}^\alpha ) = \sigma _c( N_{n,t}^\alpha ) = {{\mathbb {R}}}\) for every \(\alpha = 0,1,2,3\).
-
(2)
It holds \(D( N_{n,t}^\alpha ) \supset {{{\mathcal {S}}}}(\mathcal{H})\) and more strongly
$$\begin{aligned} N_{n,t}^\alpha ( {{{\mathcal {S}}}}({{{\mathcal {H}}}})) \subset {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,, \end{aligned}$$(26)
and \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is a core for all those operators.
-
(3)
The Heisenberg commutation relations hold, where \(k,h=1,2,3\):
$$\begin{aligned} {[} N_{n,t}^k, N_{n,t}^h]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} =[ P_{n h}, P_{n k}]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} =0\,, \qquad [ N_{n,t}^k, P_{n h}]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} = i\delta ^k_hI|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} \nonumber \\ \end{aligned}$$(27)
so that, in particular, the statement of the Heisenberg principle holds for \(k=1,2,3\):
$$\begin{aligned} \Delta _\psi N^k_{n,t} \Delta _\psi P_{nk} \ge 1/2\,, \quad \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,. \end{aligned}$$(28)
-
(4)
The Heisenberg time evolution relation is valid:
$$\begin{aligned} U^{(n)\dagger }_t N_{n,0}^kU^{(n)}_t\psi = N_{n,t}^k\psi = N_{n,0}^k\psi + t\frac{P_{nk }}{P_{n0}}\psi \quad \text{ for } \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \text{ and } k=1,2,3\,. \nonumber \\ \end{aligned}$$(29)
-
(5)
\(IO(1,3)_+\) covariance relations are true, if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and \(IO(1,3)_+ \ni h= (\Lambda _h, a_h)\),
$$\begin{aligned} U_h N_{n,t}^\alpha U_h^{-1} \psi = (\Lambda ^{-1}_h)^\alpha _\beta ( N^{\beta }_{\Lambda _h n, t_{h}} - a_h^\beta I)\psi , \quad \forall h \in IO(1,3)_+\,. \nonumber \\ \end{aligned}$$(30)
Proof
See “Appendix A”. \(\square \)
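For orientation only, the step from (27) to (28) is the standard Robertson-type argument, sketched here under the additional assumptions that \(||\psi ||=1\) and that \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is invariant under \(P_{nk}\) (for \(N^k_{n,t}\) this is (26)). The symbols \(A:= N^k_{n,t} - \langle \psi | N^k_{n,t}\psi \rangle I\) and \(B:= P_{nk} - \langle \psi | P_{nk}\psi \rangle I\) are introduced here only for illustration. Since \([A,B]\psi = i\psi \) by (27) and both operators are symmetric on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\),
$$\begin{aligned} 1 = |\langle \psi | [A,B]\psi \rangle | = |\langle A\psi | B\psi \rangle - \langle B\psi | A\psi \rangle | = 2\,|\textrm{Im}\langle A\psi | B\psi \rangle | \le 2\, ||A\psi ||\, ||B\psi || = 2\, \Delta _\psi N^k_{n,t}\, \Delta _\psi P_{nk}\,, \end{aligned}$$
where the inequality is the Cauchy–Schwarz inequality. This is the content of (28); the complete proof is the one in Appendix A.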
If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), property (4) implies that the map \({{\mathbb {R}}}\ni t \mapsto \langle \psi | N^{\alpha }_{n,t} \psi \rangle \in {{\mathbb {R}}}^4 \equiv {{\mathbb {M}}}\), \(\alpha =0,1,2,3\), is the coordinate description of a timelike curve, i.e., the time evolution of a point in the rest space of n with speed that is strictly less than the light speed. In fact, the following corollary holds, which strongly relies on the overall initial hypothesis \(m>0\). It is a sort of Ehrenfest theorem for the position of a massive free Klein–Gordon particle.
Corollary 14
Let \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) satisfy \(||\psi ||=1\). The expectation values of the Newton–Wigner position self-adjoint operators \( N^1_{n,t}, N^2_{n,t}, N^3_{n,t}\) (of a reference frame \(n\in {{\textsf{T}}}_+\) with a co-moving Minkowskian coordinate system \(t=x^0,x^1,x^2,x^3\)) describe a timelike worldline since
Proof
See “Appendix A”. \(\square \)
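The reason why \(m>0\) is essential can be sketched as follows; this is a heuristic outline and not the proof of Appendix A. It assumes that, in the momentum representation on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the operator \(P_{nk}/P_{n0}\) in (29) acts as multiplication by \(\pm p_k/E(\vec {p})\) with \(E(\vec {p}):= \sqrt{|\vec {p}|^2+m^2}\), the sign depending on index conventions that do not affect the estimate. By (29), the velocity components are \(v^k := \frac{d}{dt}\langle \psi | N^k_{n,t}\psi \rangle = \langle \psi | \frac{P_{nk}}{P_{n0}}\psi \rangle \), and the Cauchy–Schwarz inequality with \(||\psi ||=1\) gives
$$\begin{aligned} \sum _{k=1}^3 (v^k)^2 \le \sum _{k=1}^3 \left\| \frac{P_{nk}}{P_{n0}}\psi \right\| ^2 = \int _{{\textsf{V}}_{m,+}} \frac{|\vec {p}|^2}{|\vec {p}|^2+m^2}\, |\psi (p)|^2 \, d\mu _m(p) < \int _{{\textsf{V}}_{m,+}} |\psi (p)|^2 \, d\mu _m(p) = 1\,. \end{aligned}$$
The strict inequality holds because \(|\vec {p}|^2/(|\vec {p}|^2+m^2)<1\) pointwise when \(m>0\) and \(\psi \ne 0\); for \(m=0\) the integrand ratio would be identically 1 and the bound would degenerate.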
I stress that the found result together with the covariance properties stated in Propositions 8 and 13 suggests that the Newton–Wigner position localization observable possesses important physically sound features which should be preserved in any improvement of this sort of formalization. On the other hand, some substantial improvement is also necessary because, as we shall see shortly, the Newton–Wigner localization also suffers from physically insurmountable issues related to causality.
4 Problems with spatial localization
This section is devoted to examining the consequences for the Newton–Wigner position localization observable of an important general result by Hegerfeldt [20, 21] that, in the end, rules it out. The analysis only concerns the issue (I1) presented in the introduction and extends to more general notions of spatial localization based on POVMs rather than PVMs.
I stress that I will stick to the basic version of Hegerfeldt’s result. A modern formulation, which improves on Hegerfeldt’s original ideas, appears in [8, 9].
4.1 Castrigiano’s causality requirement
Suppose that a one-particle Klein–Gordon pure state represented by \(\psi \in {{{\mathcal {H}}}}\), with \(||\psi ||=1\), defines a family \(\mu ^\psi \) of probability measures \(\mu ^\psi _{n,t}: {\mathscr {L}}(\Sigma _{n,t}) \rightarrow [0,1]\), where \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\), such that \(\mu ^\psi _{n,t}(\Delta )\) represents the probability of detecting the particle in \(\Delta \subset \Sigma _{n,t}\). I will call this collection a family of spatial localization probability measures associated to the state \(\psi \). How this association is implemented will be discussed later.
A physically meaningful requirement on families of spatial localizations was explicitly introduced by Castrigiano (Footnote 4) in [8] and therein deeply analyzed in the case of particles with spin (within the more elaborate notion of causal system). Castrigiano’s requirement was actually formulated in terms of POVMs, which I will introduce later. Here I adopt a definition in terms of families of probability measures which is equivalent to Castrigiano’s one as soon as one passes to deal with POVMs.
The next definition illustrates Castrigiano’s causality requirement, corresponding to item (b) in the definition below. The notion of causal time evolution presented in (a) was also introduced by Castrigiano. I stress that the distinction between (a) and (b) is just functional to this study, though the validity of (a) is an evident consequence of (b) (Footnote 5), which is the causality condition introduced in [8].
Definition 15
Let
be the family of spatial localization probability measures of a pure state represented by \(\psi \in {{{\mathcal {H}}}}\) with \(||\psi ||=1\).
-
(a)
A given \(n\in {{\textsf{T}}}_+\) defines a causal time evolution if, for every \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\),
$$\begin{aligned} \mu ^\psi _{n,t}(\Delta ) \le \mu ^\psi _{n,t'}(\Delta ') \quad \forall t'\in {{\mathbb {R}}}\,, \end{aligned}$$(32)
where \(\Delta ':= \left( J^+(\Delta ) \cup J^-(\Delta ) \right) \cap \Sigma _{n,t'}\).
-
(b)
(Castrigiano’s causality requirement) The full family \(\mu ^\psi \) is causal if, for every \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\), it holds
$$\begin{aligned} \mu ^\psi _{n,t}(\Delta ) \le \mu ^\psi _{n',t'}(\Delta ') \quad \forall n, n'\in {{\textsf{T}}}_+ \,, \forall t, t'\in {{\mathbb {R}}}\,, \end{aligned}$$(33)
where \(\Delta ':= \left( J^+(\Delta ) \cup J^-(\Delta ) \right) \cap \Sigma _{n',t'}\).
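As a worked instance of condition (a), offered as an illustration and assuming the standard meaning of \(J^\pm (\Delta )\) as the causal future and past of \(\Delta \) in units where the speed of light is 1, take for \(\Delta \) an open ball \(B_r(e)\subset \Sigma _{n,t}\) and let \(e'\in \Sigma _{n,t'}\) denote the point with the same spatial coordinates as e in a coordinate system co-moving with n (the symbol \(e'\) is introduced here only for this example). For \(t'\ne t\), a point of \(\Sigma _{n,t'}\) is causally connected to some point of \(B_r(e)\) exactly when its spatial distance from \(e'\) is less than \(r+|t'-t|\), so that
$$\begin{aligned} \Delta ' = \left( J^+(B_r(e)) \cup J^-(B_r(e)) \right) \cap \Sigma _{n,t'} = B_{r+|t'-t|}(e')\,, \qquad \mu ^\psi _{n,t}(B_r(e)) \le \mu ^\psi _{n,t'}(B_{r+|t'-t|}(e'))\,. \end{aligned}$$
In words: the probability of finding the particle in a ball cannot exceed the probability of finding it, at another time, in the ball enlarged at the speed of light.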
Remark 16
-
(1)
The reason why I passed from \({\mathscr {B}}(\Sigma _{n,t})\) to \({\mathscr {L}}(\Sigma _{n,t})\) is that, if \(\Delta \in {\mathscr {B}}(\Sigma _{n,t})\) then it may happen that \(\Delta ' \not \in {\mathscr {B}}(\Sigma _{n',t'})\). Vice versa, if \(\Delta \subset \Sigma _{n,t}\) (not necessarily Lebesgue measurable!), then \(\Delta ' \in {\mathscr {L}}(\Sigma _{n',t'})\) for every \(n'\ne n\) and \(t,t'\in {{\mathbb {R}}}\) as established in Lemma 16 of [8].
-
(2)
Evidently, the validity of (b) implies that (a) holds for every choice of \(n\in {{\textsf{T}}}_+\). However, if (a) is true for all \(n\in {{\textsf{T}}}_+\), (b) can be false in principle.
-
(3)
The definition of causal family of spatial localizations is symmetric under time reversal, i.e., it also considers \(J^-(\Delta )\). This is because, if interpreting the probability as a density of particles, the particles which reached \(\Delta \) at time t must have passed through \(J^-(\Delta )\cap \Sigma _{n',t'}\) for every rest space \(\Sigma _{n',t'}\) in the past of \(\Delta \). There are intermediate situations where the intersection of \(\Sigma _{n',t'}\) and \(\Sigma _{n,t}\) includes \(\Delta \) but they can be treated separately by dividing the particles into two cases. \(\blacksquare \)
4.2 Justification of the causal condition in the special case of sharp localization
The condition (b) above seems physically reasonable. However, it is not obvious how to justify it within the framework of this work (and the analogous ones), as everything should be justified within the framework of the issue (I1) disregarding (I2). In other words, I should not refer to any issue concerning post-measurement states, but I have to stick to a unique given family \(\mu ^\psi \). I can at most perform one position measurement because, after a measurement, referred to the state \(\psi \) and the family \(\mu ^\psi \), the state changes (Footnote 6) \(\psi \rightarrow \psi '\) and the family \(\mu ^\psi \) changes accordingly \(\mu ^\psi \rightarrow \mu ^{\psi '}\), in a way I cannot control without a precise choice of the post-measurement state. Instead, Definition 15 considers a unique family \(\mu ^\psi \).
There is a case, however, where a justification of the requirements in the above definition is sufficiently easy even referring to a unique family \(\mu ^\psi \) (one measurement procedure only). Let us illustrate how the failure of condition (a) (thus (b)) for a choice of \(n\in {{\textsf{T}}}_+\) would permit superluminal transmission of information in the special case where there are states strictly localized at time \(t=0\) in some bounded region \(\Delta \), i.e., such that \(\mu ^\psi _{n,0}(\Delta )=1\) in the reference frame \(n\in {{\textsf{T}}}_+\). This justification does not need to tackle the issue of the post-measurement state.
Consider two types of Klein–Gordon particles with masses \(m_1 \ne m_2\), respectively, and collect, at \(t=0\), a large number of these particles (of the two types) in a box at rest in \(\Sigma _{n,0}\). We can imagine the box as the bounded region \(\Delta \subset \Sigma _{n,0}\). I assume that it is possible to open the box only for the mass \(m_1\) or mass \(m_2\) particles with some sort of filter. The procedure is then as follows:
-
(1)
I make a decision about which type of particles (\(m_1\) or \(m_2\)) to free from \(\Delta \) at time \(t=0\) and I free it;
-
(2)
somebody detects the particles in \(\Sigma _{n,t}\) at time \(t>0\) and observes the value of the mass.
If (32) failed, a particle could be detected in the region \(\Delta ' \subset \Sigma _{n,t}\) with \(\Delta ' \cap J^+(\Delta )= \varnothing \), and this procedure would manage to transmit the information about my mass choice made in the spatial region \(\Delta \) at time \(t=0\) outside the causal future of this event!
The crucial point in the above discussion is that some states are at disposal whose probability measure at \(t=0\) is zero outside the bounded region \(\Delta \).
Very unfortunately, as I will discuss shortly, sharply localized position probabilities are ruled out by the Hegerfeldt theorem.
The above justification of the causality condition (b) which only relies on (I1) does not seem to be that easy to re-propose if referring to families \(\mu ^\psi \) which are not sharply localized (see Sect. 5.3). In this case, as discussed in the rest of this work, the position observable is described in terms of a POVM instead of a PVM. In principle, in the absence of sharply localized states, one may try to use again an analogous argument where, at time \(t=0\), two types of bosons stay in a box with a certain very large probability. Opening the box for only one kind of boson should be formalized in terms of suitable quantum operations [6], not necessarily trace preserving, which define the quantum states of the two types of particles at time \(t>0\). Here, precise theoretical choices seem to be necessary and the elementary setting of (I1) does not seem to be sufficient.
This matter deserves further attention, but in this paper I will be content with assuming Castrigiano’s causality requirement and the consequent notion of causal time evolution as natural ideas.
4.3 Spatial localization in terms of POVMs
As is known (see, e.g., [29]), if \(A:{{{\mathcal {H}}}} \rightarrow {{{\mathcal {H}}}}\), then \(A\ge 0\) means \(\langle \psi |A\psi \rangle \ge 0\) for all \(\psi \in {{{\mathcal {H}}}}\). This requirement for A is equivalent to \(A=A^\dagger \in {{\mathfrak {B}}}({{{\mathcal {H}}}})\) and \(\sigma (A) \subset [0,+\infty )\). Finally, if also \(B: {{{\mathcal {H}}}}\rightarrow {{{\mathcal {H}}}}\), then \(A\ge B\) means \(A-B \ge 0\).
An effect (see [6] for a modern up-to-date textbook on the subject) is a bounded operator \({{\textsf{E}}}\in {{\mathfrak {B}}}(\mathcal{H})\), for a Hilbert space \({{{\mathcal {H}}}}\), such that \(0\le {{\textsf{E}}}\le I\). \({{\mathfrak {E}}}({{{\mathcal {H}}}})\) will indicate henceforth the set of effects in \({{{\mathcal {H}}}}\). An orthogonal projector is an effect but there are effects which are not orthogonal projectors.
A (normalized) Positive Operator Valued Measure (POVM) is a map
where \(\Sigma (X)\) is a \(\sigma \)-algebra on X, such that the function is (see Def. 4.5 in [6] and the remarks under that definition)
-
(a)
normalized: \({{\textsf{E}}}(X) =I\);
-
(b)
\(\sigma \)-additive: \(\sum _{n \in {{\mathbb {N}}}} {{\textsf{E}}}(\Delta _n) = {{\textsf{E}}}(\cup _{n\in {{\mathbb {N}}}}\Delta _n)\) when \(\Delta _n\cap \Delta _m = \varnothing \) for \(n\ne m\) and the sum is understood in the weak (or equivalently strong) operator topology.
Notice that (a) and (b) imply, in particular, that \({{\textsf{E}}}(\varnothing )=0\). Furthermore (b) can be equivalently replaced by the requirement that \(\Sigma (X) \ni \Delta \mapsto \langle \psi | {{\textsf{E}}}(\Delta ) \psi ' \rangle \) is a complex measure (with finite total variation) for every \(\psi ,\psi ' \in {{{\mathcal {H}}}}\).
Remark 17
-
(1)
It is clear that a PVM is a specific case of POVM where the positive operators \({{\textsf{E}}}(\Delta )\) are orthogonal projectors.
-
(2)
A POVM does not satisfy in general \([{{\textsf{E}}}(\Delta ),{{\textsf{E}}}(\Delta ')]=0\) for \(\Delta \cap \Delta ' = \varnothing \), contrary to what happens for a PVM.
-
(3)
The one-to-one link between self-adjoint operators and PVMs does not hold in case of POVMs. Something remains, however, since under some technical hypotheses a POVM is uniquely determined by a symmetric operator, in terms of the first moment of the POVM, as I will briefly discuss later. This fact, or more precisely the failure of the hypotheses which guarantee it, will play some role in this paper. \(\blacksquare \)
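Items (1) and (2) can be checked on a finite-dimensional toy model. The sketch below is an illustration that does not appear in the text above: it uses the well-known "trine" POVM on a qubit, with \(X=\{0,1,2\}\), \(\Sigma (X)\) the power set of X, and \({{\textsf{E}}}(\Delta )=\sum _{k\in \Delta }E_k\). The function name `trine_effects` and the chosen states are hypothetical.

```python
import numpy as np

def trine_effects():
    """Three qubit effects E_k = (2/3)|v_k><v_k|, with v_k at Bloch angles 0, 120, 240 degrees."""
    effects = []
    for k in range(3):
        theta = 2.0 * np.pi * k / 3.0
        v = np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])  # real unit vector
        effects.append((2.0 / 3.0) * np.outer(v, v))
    return effects

effects = trine_effects()
E0, E1 = effects[0], effects[1]  # E({0}) and E({1}): disjoint outcome sets

# (a) normalization E(X) = I, and 0 <= E_k <= I for each effect.
normalized = np.allclose(sum(effects), np.eye(2))
spectra_ok = all(
    np.all(np.linalg.eigvalsh(E) > -1e-12) and np.all(np.linalg.eigvalsh(E) < 1 + 1e-12)
    for E in effects
)

# Not a PVM: the effects are not projectors, and they do not commute.
is_projector = np.allclose(E0 @ E0, E0)
commutator_norm = np.linalg.norm(E0 @ E1 - E1 @ E0)

# Outcome probabilities <psi|E_k psi> for a pure state and tr(rho E_k) for a mixed one.
psi = np.array([1.0, 0.0])
p_pure = np.array([psi @ E @ psi for E in effects])
rho = np.eye(2) / 2.0
p_mixed = np.array([np.trace(rho @ E) for E in effects])
print(normalized, spectra_ok, is_projector, commutator_norm, p_pure, p_mixed)
```

Both probability vectors sum to 1, as they must for a normalized POVM, although no \(E_k\) is an orthogonal projector and \(E_0\), \(E_1\) fail to commute.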
The general notion of observable, in the modern approaches to Quantum Theory, is a (normalized) POVM on a \(\sigma \)-algebra \(\Sigma (X)\) and taking values in \({{\mathfrak {B}}}({{{\mathcal {H}}}})\), where \({{{\mathcal {H}}}}\) is the Hilbert space of the considered quantum system:
-
(1)
The elements \(\Delta \in \Sigma (X)\) are the outcomes of measurements and,
-
(2)
if \(\rho \) is a generally mixed state (a trace-class, unit-trace positive operator in \({{\mathfrak {B}}}({{{\mathcal {H}}}})\)), then \(\Sigma (X) \ni \Delta \mapsto tr(\rho {{\textsf{E}}}(\Delta ))\) is the probability measure associated to these outcomes. It boils down to \(\Sigma (X) \ni \Delta \mapsto \langle \psi |{{\textsf{E}}}(\Delta ) \psi \rangle \) in case of a pure state represented by the unit vector \(\psi \in {{{\mathcal {H}}}}\).
Definition 18
A relativistic spatial localization observable for a Klein–Gordon particle of mass \(m>0\) described in the (complex, separable) Hilbert space \({{{\mathcal {H}}}}\) is defined as a family of normalized POVMs \({{\textsf{E}}}_{n,t}: {\mathscr {L}}(\Sigma _{n,t}) \rightarrow {{\mathfrak {E}}}({{{\mathcal {H}}}})\), where \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\), that is covariant with respect to the strongly continuous unitary representation U of \(IO(1,3)_+\) (7):
A very detailed technical analysis of the notion above (called Poincaré covariant POL therein) appears in Sects. 6 and 7 of [8] referring to a general system and establishing some extension and uniqueness properties from POVMs covariant under the Euclidean group to POVMs covariant under the full \(IO(1,3)_+\) group.
The use of POVMs defined on \({\mathscr {L}}(\Sigma _{n,t})\) is mandatory due to Remark 16.
By the same elementary procedure used to complete positive measures, a POVM \({{\textsf{E}}}\) defined on \({\mathscr {B}}(\Sigma _{n,t})\) uniquely extends to a completion: another POVM \({\overline{{{\textsf{E}}}}}\), on a larger \(\sigma \)-algebra \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{E}}}}\) made of the unions of the elements of \({\mathscr {B}}(\Sigma _{n,t})\) with the subsets of the zero-\({{\textsf{E}}}\)-measure sets,
Exactly as in standard measure theory, \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{E}}}}\) is characterized by the fact that it is the smallest \(\sigma \)-algebra including \({\mathscr {B}}(\Sigma _{n,t})\) and equipped with an extension \({\overline{{{\textsf{E}}}}}\) of \({{\textsf{E}}}\) such that all subsets of zero-\({\overline{{{\textsf{E}}}}}\)-measure sets in \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{E}}}}\) belong to \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{E}}}}\).
Trivially, the outlined procedure extends a POVM which is a PVM to a completion that is a PVM as well. In particular, the completion of the previously discussed Newton–Wigner PVM turns out to be defined on \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{Q}}}_{n,t}}= {\mathscr {L}}(\Sigma _{n,t})\) as a consequence of (1) in Proposition 12 and elementary properties of the Lebesgue measure: \({\mathscr {L}}(\Sigma _{n,t})\ni \Delta \mapsto \overline{{{\textsf{Q}}}_{n,t}}(\Delta ) \in {{\mathfrak {E}}}({{{\mathcal {H}}}})\). This completion still satisfies the \(IO(1,3)_+\) covariance and all the properties established in the previous section, as one immediately proves. In the rest of the paper, I will simply write \({{\textsf{Q}}}_{n,t}(\Delta )\) in place of \(\overline{{{\textsf{Q}}}_{n,t}}(\Delta )\) when \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\).
4.4 Troubles with Newton–Wigner and sharply localized states: the Hegerfeldt theorem
Hegerfeldt [21] proved the following quite devastating theorem against the Newton–Wigner notion of localization, in particular. I reformulate the result established in [21] into the language of Definition 15 and explicitly for a massive Klein–Gordon real spinless particle.
Theorem 19
(Hegerfeldt) Consider a spatial localization POVM of a massive Klein–Gordon particle according to Def. 18. Suppose that there are \(\psi \in {{{\mathcal {H}}}}\) with \(||\psi ||=1\) and \(e\in \Sigma _{n_e,t_e}\) such that the probability to find the particle outside the balls \(B_r(e) \subset \Sigma _{n_e,t_e}\) with common center e and variable radii \(r>0\) satisfies
Then \(n_e\) cannot define a causal time evolution of the family of probability measures \(\mu ^{\psi }_{n,t}:= \langle \psi | {{\textsf{E}}}_{n,t}(\Delta ) \psi \rangle \) according to condition (a) in Def. 15.
A crucial corollary follows against the Newton–Wigner notion of spatial localization.
Corollary 20
The (completion of the) Newton–Wigner spatial localization observable does not satisfy Castrigiano’s causality condition, because (a) in Def. 15 fails for every choice of \(n\in {{\textsf{T}}}_+\).
Proof
Arbitrarily fix \(e\in \Sigma _{n_e,t_e}\), choose \(R>0\) and consider the orthogonal projector \({{\textsf{Q}}}_{n_e,t_e}(B_R(e))\). It holds \({{\textsf{Q}}}_{n_e,t_e}(B_R(e)) \ne 0\) due to (1) in Proposition 12, since a non-empty open set has strictly positive measure \(\textrm{d}\Sigma _{n_e,t_e}\). Therefore, there exists \(\psi = {{\textsf{Q}}}_{n_e,t_e}(B_R(e))\psi \) with \(||\psi ||=1\). Evidently \(\langle \psi | {{\textsf{Q}}}_{n_e,t_e}(\Sigma _{n_e,t_e} {\setminus } B_r(e)) \psi \rangle =0\) if \(r>R\) since
\(\psi \) satisfies the hypotheses of Hegerfeldt’s theorem with respect to the family of balls \(B_r(e)\). Arbitrariness of \(n_e \in {{\textsf{T}}}_+\) concludes the proof. \(\square \)
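The failure of condition (a) can also be observed in a toy computation. The following sketch rests on assumptions that are not part of the proof: one spatial dimension, a periodic grid, and the assumption that in the Newton–Wigner representation the time evolution acts as \(e^{-it\sqrt{-\Delta +m^2}}\) while the PVM acts by multiplication with \(1_\Delta \). The function name `nw_evolve`, the initial bump, and all parameter values are hypothetical.

```python
import numpy as np

def nw_evolve(Psi0, dx, m, t):
    """Evolve a 1D NW-type wavefunction with exp(-i t sqrt(-d^2/dx^2 + m^2)) via FFT."""
    k = 2.0 * np.pi * np.fft.fftfreq(Psi0.size, d=dx)  # discrete wave numbers
    return np.fft.ifft(np.exp(-1j * t * np.sqrt(k**2 + m**2)) * np.fft.fft(Psi0))

# Hypothetical parameters: mass m = 1, box [-40, 40), 4096 grid points.
m, half_width, n_points = 1.0, 40.0, 4096
x = np.linspace(-half_width, half_width, n_points, endpoint=False)
dx = x[1] - x[0]

# Initial state sharply localized in the interval (-R, R) at t = 0 (smooth bump).
R, t = 1.0, 0.5
Psi0 = np.zeros_like(x)
inside = np.abs(x) < R
Psi0[inside] = np.exp(-1.0 / (1.0 - (x[inside] / R) ** 2))
Psi0 = Psi0 / np.sqrt(np.sum(np.abs(Psi0) ** 2) * dx)  # unit L^2 norm

Psi_t = nw_evolve(Psi0, dx, m, t)
density = np.abs(Psi_t) ** 2 * dx

total = density.sum()                            # stays equal to 1 (unitary evolution)
inside_cone = density[np.abs(x) < R + t].sum()   # probability in the enlarged interval
leak = total - inside_cone                       # probability outside the light cone
print(total, inside_cone, leak)
```

Condition (32) would require `inside_cone` to equal 1 here, since the initial probability of the interval is 1. In this toy model a small but strictly positive `leak` appears instead, which is the qualitative phenomenon behind Corollary 20; the specific number has no significance beyond this discretization.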
An interesting paper by Ruijsenaars [34] presents some explicit numerical estimates of the probabilities of recording a violation of causality through measurements of the Newton–Wigner observable for a scalar Klein–Gordon massive particle.
It is evident that, on account of the corollary, Physics rules out the Newton–Wigner notion of localization because it does not satisfy a basic requirement about causality, in particular, taking Sect. 4.2 into account. However, this is very disappointing because the Newton–Wigner position operator shows some natural and quite appealing features, as previously illustrated in Proposition 13 and its Corollary 14. This inconclusive asymmetry is very annoying and is certainly a reason why the Newton–Wigner notion of localization is still a subject of discussion in the literature. In the rest of the paper we will see how it is possible to keep the good things (the position operator) and get rid of the bad ones (the PVM).
Remark 21
-
(1)
There are other, even more severe, problems with the Newton–Wigner notion of spatial localization and causality when one analyzes it on the ground of the issue (I2) of the introduction, by assuming the Lüders projection postulate about the post-measurement state.
-
(2)
The Newton–Wigner notion of spatial localization is acausal not only with respect to time translations but equally regarding boosts. This so-called frame dependence of Newton–Wigner localization has been observed already in [37] and it is still studied in the literature, e.g., [15]. It is obvious that any notion of spatial localization in terms of POVMs should meet the requirement of frame independence.
-
(3)
It is interesting to notice that the example of the rejection of the Newton–Wigner observable shows how the idea that every PVM/self-adjoint operator in the Hilbert space of a quantum system must be an observable is definitely untenable. However, to the author’s knowledge this is the first time that, in quantum mechanics, the rejection of a self-adjoint operator as an observable is due to local causality and not to the existence of a gauge group or a superselection rule.
-
(4)
The above version of the Hegerfeldt theorem is the classic one: it explicitly refers to the Klein–Gordon particle and can be immediately extended to particles with spin. Actually it is not necessary that full covariance with respect to our representation of \(IO(1,3)_+\) holds. There are more abstract versions of this theorem that refer to abstract POVMs and rely only on (a) positivity of the self-adjoint generator of temporal translations and (b) covariance with respect to four-translations. See, in particular, Theorem B1 (Footnote 7) in [1]. A thorough analysis of the interplay of spatial localization and Hamiltonian positivity appears in Sects. 4 and 5 of [8]. \(\blacksquare \)
5 The spatial localization observable proposed by Terno
In [8], Castrigiano proved that for spin 1/2 it is possible to define a spatial localization observable different from the Newton–Wigner one which satisfies the causality requirement (b) of Def. 15. That observable is a PVM if the positivity assumption on the Hamiltonian evolutor is not imposed and becomes a POVM when restricting to the subspace of positive energy. Unfortunately, that construction does not work for scalar Klein–Gordon particles as discussed in Sect. 23 of [8].
5.1 Terno’s POVM: the heuristic definition from QFT
Terno [36] introduced a position localization POVM starting from elementary notions of free QFT in Minkowski spacetime. Though that notion was also extended to photons in [36], here I stick to the case of a real scalar massive Klein–Gordon field.
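I first review the definition of that POVM in the formal language of theoretical physics of QFT. Later I will translate it into a more mathematically rigorous setting. I start from the stress-energy operator of QFT. Let
$$\begin{aligned} :{\hat{T}}_{\mu \nu }:(x) := \, :\partial _\mu {\hat{\phi }}\, \partial _\nu {\hat{\phi }}:(x) - \frac{1}{2} \eta _{\mu \nu } \left( :\partial ^\alpha {\hat{\phi }}\, \partial _\alpha {\hat{\phi }}:(x) + m^2 :{\hat{\phi }}^2:(x)\right) \end{aligned}$$
be the coordinate representation of the normally ordered stress-energy tensor operator in the symmetric Fock space \({{\mathfrak {F}}}_+({{{\mathcal {H}}}})\) of the real Klein–Gordon field operator \({\hat{\phi }}\) with mass \(m>0\). Referring to a Minkowski coordinate system co-moving with \(n\in {{\textsf{T}}}_+\), if \(\Delta \subset \Sigma _{n,t}\), define
$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Delta ) := \frac{1}{\sqrt{H_n}} P_1 \int _{\Delta } :{\hat{T}}_{\mu \nu }:(x) n^\mu n^\nu \, d \Sigma _{n,t}(x) P_1 \frac{1}{\sqrt{H_n}}\,, \end{aligned}$$(35)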
where \(P_1: {{\mathfrak {F}}}_+({{{\mathcal {H}}}}) \rightarrow {{{\mathcal {H}}}}\) is the orthogonal projector onto the one-particle space of the symmetric Fock space \({{\mathfrak {F}}}_+({{{\mathcal {H}}}})\) constructed upon the Minkowski vacuum state with \({{{\mathcal {H}}}}\) as the one-particle subspace. Actually, the definition in [36] uses the total Hamiltonian in the Fock space and \(P_1\) is swapped with the inverse square root of the said Hamiltonian, but that definition is formally equivalent to that above.
Formally speaking, without paying attention to domains, as \(:{\hat{T}}_{\mu \nu }: (x) n^\mu n^\nu \) turns out to be positive, the integral is a positive operator so that \(0\le {{\textsf{A}}}_{n,t}(\Delta ) \le {{\textsf{A}}}_{n,t}(\Delta ')\) if \(\Delta \subset \Delta '\). The integral on the whole rest space amounts, on the one-particle space, to
$$\begin{aligned} P_1 \int _{\Sigma _{n,t}} :{\hat{T}}_{\mu \nu }:(x) n^\mu n^\nu \, d \Sigma _{n,t}(x) P_1 = H_n\,, \quad \text{so that} \quad {{\textsf{A}}}_{n,t}(\Sigma _{n,t}) = I\,. \end{aligned}$$
Hence, \(0\le {{\textsf{A}}}_{n,t}(\Delta )\le I\). \(\sigma \)-additivity with respect to \(\Delta \) is guaranteed by the very presence of the integration over \(\Delta \). As a matter of fact, barring mathematical details I will fix shortly, that is a (non-commutative) POVM.
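A straightforward formal manipulation of the right-hand side of (35) also yields a natural \(IO(1,3)_+\)-covariance relation
$$\begin{aligned} U_{h} {{\textsf{A}}}_{n,t}(\Delta ) U_{h}^{-1} = {{\textsf{A}}}_{\Lambda _h n,t_h}(h\Delta )\,, \quad \forall h \in IO(1,3)_+\,. \end{aligned}$$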
The physical idea behind Terno’s definition should be clear: probabilistically speaking, the particle stays where the energy is. This idea was previously formulated in [2], where, however, no explicit POVM was constructed. The crucial normalization factors \(H_n^{-1/2}\) were explicitly introduced in [36].
5.2 Terno’s spatial localization observable
Expanding the quantum field in modes as usual, a straightforward computation starting from (35) yields, for \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) and \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\),
which I will assume to be the definition of the family of operators \({{\textsf{A}}}_{n,t}(\Delta )\), for \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\), on the domain \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\).
Theorem 22
Referring to a massive real Klein–Gordon particle, the family of operators \({{\textsf{A}}}_{n,t}(\Delta ): {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \rightarrow {{{\mathcal {H}}}}\) defined in (36) for \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) uniquely continuously extends to a POVM—we shall indicate with the same symbol—for every given pair n, t. The following further facts are valid.
-
(1)
The family is covariant with respect to the strongly continuous unitary representation U of \(IO(1,3)_+\) (7):
$$\begin{aligned} U_{h} {{\textsf{A}}}_{n,t}(\Delta ) U_{h}^{-1} = {{\textsf{A}}}_{\Lambda _h n,t_h}(h\Delta ) \,, \quad \forall \Delta \in {\mathscr {L}}(\Sigma _{n,t})\,, \quad \forall h \in IO(1,3)_+\,, \end{aligned}$$(37)
and thus it defines a relativistic spatial localization observable.
-
(2)
Referring to the (Lebesgue-completion of the) Newton–Wigner spatial localization observable \({{\textsf{Q}}}_{n,t}\), the following identity is true
$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Delta ) = {{\textsf{Q}}}_{n,t}(\Delta ) + \frac{1}{2}\left( \eta ^{\mu \nu }\frac{P_{n\mu }}{H_n} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{P_{n\nu }}{H_n} + \frac{m}{H_n} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{m}{H_n} \right) \end{aligned}$$(38)
for every \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\). (The various everywhere-defined bounded composite operators \(P_n^\mu /H_{n}\) and \(m/H_n\) are defined in terms of the joint spectral measure of \(P^\mu \) and with standard spectral calculus.)
Proof
Let us prove (1) and (2). Fix \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\). If \(\psi ',\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and we indicate by \(B\psi \) the right-hand side of (36) and by C the right-hand side of (38), a straightforward computation that takes (21) into account proves that \(\langle \psi '| B\psi \rangle = \langle \psi '|C \psi \rangle \). Since \(\psi '\) varies in a dense set, the found identity implies that \(B\psi = C\psi \) for all \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). As C is continuous and everywhere defined on \({{{\mathcal {H}}}}\), we conclude that the operator defined in (36) uniquely extends by continuity to the operator in (38). On the other hand, since the operators \({{\textsf{Q}}}_{n,t}(\Delta )\) define a PVM, the structure of the right-hand side of (38), which can be re-arranged (using \(P_{n0}/H_n = -I\)) to
$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Delta ) = \frac{1}{2} {{\textsf{Q}}}_{n,t}(\Delta ) + \frac{1}{2}\sum _{k=1}^3 \frac{P_{nk}}{H_n} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{P_{nk}}{H_n} + \frac{1}{2} \frac{m}{H_n} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{m}{H_n}\,, \end{aligned}$$(39)
defines a family of positive operators of \({{\mathfrak {B}}}({{{\mathcal {H}}}})\). Notice, in particular, that \( \frac{P_{n\nu }}{H_n}= \left( \frac{P_{n\nu }}{H_n}\right) ^\dagger \in {{\mathfrak {B}}}({{{\mathcal {H}}}})\) and \( \frac{m}{H_n}= \left( \frac{m}{H_n}\right) ^\dagger \in {{\mathfrak {B}}}({{{\mathcal {H}}}})\). The family of operators in the right-hand side of (39) is also evidently weakly \(\sigma \)-additive in \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\). The constructed POVM is normalized because \({{\textsf{Q}}}_{n,t}\) is:
$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Sigma _{n,t}) = \frac{1}{2} I + \frac{1}{2}\, \frac{\sum _{k=1}^3 P_{nk}^2 + m^2 I}{H_n^2} = \frac{1}{2} I + \frac{1}{2} I = I\,. \end{aligned}$$
The proof of (37) is strictly analogous to the one of (18) or it can be established immediately from it by taking (38) into account and the obvious covariance properties of the operators \(P_{n\mu }\). \(\square \)
Definition 23
Referring to Theorem 22, we call each \({{\textsf{A}}}_{n,t}\) Terno’s spatial localization POVM in the reference frame \(n\in {{\textsf{T}}}_+\) at time \(t\in {{\mathbb {R}}}\). The family \({{\textsf{A}}}\) of POVMs \({{\textsf{A}}}_{n,t}\) will be named Terno’s spatial localization observable.
Remark 24
Contrary to the case of the Newton–Wigner localization, covariance with respect to the spatial Euclidean subgroup is not sufficient to fix the structure of \( {{\textsf{A}}}_{n,t}\), since there are infinitely many POVMs with that covariance property with respect to a unitary strongly continuous representation of the Euclidean group [7]. \(\blacksquare \)
5.3 Almost localized states
The following proposition illustrates a fundamental difference between the notion of spatial localization by Newton–Wigner and the one by Terno: localized states in bounded regions are permitted by the former but are impossible for the latter. This implies, in particular, that the argument of Corollary 20 (which ruled out the Newton–Wigner localization notion) cannot be directly applied to \({{\textsf{A}}}_{n,t}\). In [36], it is proved (exploiting an argument of [2]) that the spatial decay of the probability distribution arising from the POVM \({{\textsf{A}}}_{n,t}\) does not reach the bound sufficient to trigger Hegerfeldt’s local-causality catastrophe. I will achieve that result indirectly, by establishing that the time evolution with respect to every \(n\in {{\textsf{T}}}_+\) is causal for the said POVM.
However, this is not the whole story. Indeed, the second statement of the next proposition shows that, for every (in particular, bounded) region \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) with non-empty interior, there are states which are arbitrarily good approximations of states sharply localized in that region.
Proposition 25
Referring to the Terno spatial localization observable \({{\textsf{A}}}\), the following facts are true.
-
(1)
Suppose that \(\psi \in {{{\mathcal {H}}}}\) with \(||\psi ||=1\), \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) satisfy
$$\begin{aligned} \langle \psi |{{\textsf{A}}}_{n,t}(\Delta ) \psi \rangle =1\,. \end{aligned}$$
In that case \(\Delta \) is dense in \(\Sigma _{n,t}\). In particular, \(\Delta \) cannot be bounded.
-
(2)
For every given \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\) and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) with \(Int(\Delta )\ne \varnothing \), there is a sequence of vectors \(\{\psi _j\}_{j \in {{\mathbb {N}}}}\subset {{{\mathcal {H}}}}\) such that \(||\psi _j||=1\) and
$$\begin{aligned} \langle \psi _j| {{\textsf{A}}}_{n,t}(\Delta )\psi _j\rangle \rightarrow 1\,, \quad \text{ as } j\rightarrow +\infty . \end{aligned}$$ -
(3)
For every given \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\) and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\), if \(Int(\Delta )\ne \varnothing \), then \(||{{\textsf{A}}}_{n,t}(\Delta )||=1\).
Proof
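(1) Define \(\Delta ':=\Sigma _{n,t}\setminus \Delta \). By additivity, \(\langle \psi |{{\textsf{A}}}_{n,t}(\Delta ') \psi \rangle =0\). From (39) and the fact that \({{\textsf{Q}}}_{n,t}\) is a PVM, \(\langle \psi |{{\textsf{A}}}_{n,t}(\Delta ') \psi \rangle =0\) can be rephrased to
$$\begin{aligned} \frac{1}{2} \left| \left| {{\textsf{Q}}}_{n,t}(\Delta ')\psi \right| \right| ^2 + \frac{1}{2}\sum _{k=1}^3 \left| \left| {{\textsf{Q}}}_{n,t}(\Delta ')\frac{P_{nk}}{H_n}\psi \right| \right| ^2 + \frac{1}{2} \left| \left| {{\textsf{Q}}}_{n,t}(\Delta ')\frac{m}{H_n}\psi \right| \right| ^2 =0\,. \end{aligned}$$
In particular, \({{\textsf{Q}}}_{n,t}(\Delta ')\psi =0\) and \({{\textsf{Q}}}_{n,t}(\Delta ')H_n^{-1}\psi =0\). Using the representation (23) of the Hilbert space vectors, these requirements can be restated to \(1_{\Delta '}(\vec {x}) \Psi _t(\vec {x})=0\) and \(1_{\Delta '}(\vec {x}) (\overline{-\Delta + m^2I})^{-1/2}\Psi _t(\vec {x})=0\). Hence \(\Psi _t(\vec {x})=0\) and \( (\overline{-\Delta + m^2I})^{-1/2}\Psi _t(\vec {x})=0\) a.e. on \(\Delta '\). If \(\Delta '\) included a non-empty open set, Theorem 11 would imply that \(\Psi _t=0\), which is not permitted by hypothesis. Therefore \(\Delta '\) has empty interior, that is, \(\Delta \) is dense in \(\Sigma _{n,t}\).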
(2) It is evidently sufficient to prove it for the special case \(\Delta = B_R\subset \Sigma _{n,t}\) given by an open ball of finite radius \(R>0\). Indeed, if \(\Delta \) admits non-empty interior, then \(\Delta \supset B_R\) for some such ball and thus \(0\le \langle \psi | {{\textsf{A}}}_{n,t}(B_R) \psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t}(\Delta ) \psi \rangle \le 1\) if \(||\psi ||=1\). A sequence of localizing states \(\psi _j\) for \(B_R\) is also a sequence of localizing states for \(\Delta \). Finally, we can always assume \(t=0\) without loss of generality, as the reader can immediately prove using a trivial time translation and exploiting the covariance properties of \({{\textsf{A}}}\). So we prove the thesis for the ball \(B_R\). Consider a \(C^\infty \) function \(\chi \ge 0\) on \(\Sigma _{n,0}\) with \(supp(\chi ) \subset B_R\). Let us identify \(\Sigma _{n,0}\) with \({{\mathbb {R}}}^3\) with a co-moving Minkowski coordinate system of n whose spatial origin is the center of \(B_R\). If \(\vec {a} \in {{\mathbb {R}}}^3\) is a fixed non-vanishing vector and \(j\in {{\mathbb {N}}}\),
Notice that the \(L^2\) norm of these vectors does not depend on j and is \(||\chi ||_{L^2({{\mathbb {R}}}^3, d^3x)}\). We can always choose \(\chi \) in order that \(||{\hat{\chi }}_j||_{L^2({{\mathbb {R}}}^3, d^3k)}=1 = ||\chi ||_{L^2({{\mathbb {R}}}^3, d^3x)}\) for all \(j\in {{\mathbb {N}}}\). Finally, define the family of the unit vectors \(\psi _j \in {{{\mathcal {H}}}}\),
From (17),
decomposing \(\langle \psi _j | {{\textsf{A}}}_{n,0}(B_R) \psi _j \rangle \) as in (38), we have that \(\langle \psi _j | {{\textsf{A}}}_{n,0}(B_R) \psi _j \rangle - \langle \psi _j | {{\textsf{Q}}}_{n,0} (B_R) \psi _j \rangle \rightarrow 0\) because
The proof of the limit above is postponed to “Appendix A”. This concludes the proof of (2), because \(\langle \psi _j | {{\textsf{Q}}}_{n,0}(B_R) \psi _j \rangle =1\) as said above.
(3) is an easy consequence of (2), \(0\le {{\textsf{A}}}_{n,t}(\Delta )= {{\textsf{A}}}_{n,t}(\Delta )^\dagger \le I\) and \(||{{\textsf{A}}}_{n,t}(\Delta )|| = \sup \{|\langle \psi | {{\textsf{A}}}_{n,t}(\Delta ) \psi \rangle |\,|\, ||\psi ||=1\}\). \(\square \)
5.4 Interplay of the first-moment operator of \({{\textsf{A}}}\) and the NW position operator
I now introduce the first moment of Terno’s POVM, a symmetric operator. I will prove, in particular, that its closure coincides with the Newton–Wigner position operator, so that it preserves all the good properties of the Newton–Wigner position operator.
Theorem 26
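Take \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), choose a co-moving Minkowski coordinate system \(x^0=t,x^1,x^2,x^3\). There is only one operator \(X^\mu _{n,t}: {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \rightarrow {{{\mathcal {H}}}}\), for every \(\mu := 0,1,2,3\), completely defined as the first moment of the POVM \({{\textsf{A}}}_{n,t}\):
$$\begin{aligned} \langle \psi | X^\mu _{n,t} \psi \rangle = \int _{\Sigma _{n,t}} x^\mu \, d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle \,, \quad \forall \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,. \end{aligned}$$(41)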
The following facts are true.
-
(1)
\( X^\mu _{n,t}\) satisfies
$$\begin{aligned} \langle \psi | X^\mu _{n,t} \psi \rangle = \langle \psi | N^\mu _{n,t} \psi \rangle \quad \forall \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,, \end{aligned}$$(42)
where \( N^\mu _{n,t}\) is the Newton–Wigner position operator, so that the further following facts are valid.
-
(a)
The identity holds
$$\begin{aligned} X^\mu _{n,t}= N^\mu _{n,t}|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})}\,. \end{aligned}$$(43) -
(b)
\(X^\mu _{n,t}\) is symmetric, essentially self-adjoint and its unique self-adjoint extension is \( N^\mu _{n,t}\) itself.
-
(c)
The Heisenberg commutation relations hold, where \(k,h=1,2,3\):
$$\begin{aligned} {[} X_{n,t}^k, X_{n,t}^h]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} =[ P_{n h}, P_{n k}]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} =0\,, \qquad [ X_{n,t}^k, P_{n h}]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} = i\delta ^k_hI|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})}\,. \end{aligned}$$(44) -
(d)
The \(IO(1,3)_+\) covariance relations are true, if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and \(IO(1,3)_+ \ni h= (\Lambda _h, a_h)\),
$$\begin{aligned} U_h X_{n,t}^\alpha U_h^{-1} \psi = (\Lambda ^{-1}_h)^\alpha _\beta (X^{\beta }_{\Lambda _h n, t_{h}} - a_h^\beta I)\psi , \quad \forall h \in IO(1,3)_+\,. \end{aligned}$$(45) -
(d)
The Heisenberg time evolution relation is valid (Footnote 8):
$$\begin{aligned} U^{(n)\dagger }_t X_{n,0}^k U^{(n)}_t\psi = X_{n,t}^k\psi = X_{n,0}^k \psi + t\frac{P_{nk }}{P_{n0}}\psi \quad \text{ for } \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \text{ and } k=1,2,3\,. \nonumber \\ \end{aligned}$$(46) -
(e)
If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and \(||\psi ||=1\), the first-moment operators define a timelike worldline because
$$\begin{aligned} \sum _{k=1}^3 \left( \frac{d}{dt} \langle \psi | X^{k}_{n,t} \psi \rangle \right) ^2 < 1 \,. \end{aligned}$$(47)
-
(2)
If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) with \(||\psi ||=1\) and \(k=1,2,3\),
$$\begin{aligned} \int _{\Sigma _{n,t}} (x^k)^2 d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle = \langle \psi | ( N^k_{n,t})^2\psi \rangle + \left\langle \psi \left| \frac{(P_{n0})^2-(P_{nk})^2}{2(P_{n0})^4}\psi \right. \right\rangle \,. \end{aligned}$$(48)
As a consequence, a corrected version of the Heisenberg inequality holds for \(k=1,2,3\) (restoring the physical constants):
$$\begin{aligned} \Delta _\psi X^k_{n,t} \Delta _\psi P_{nk} \ge \frac{\hbar }{2} \sqrt{1 + 2(\Delta _\psi P_{nk})^2 \left\langle \psi \left| \frac{(P_{n0})^2-(P_{nk})^2}{(P_{n0})^4}\psi \right. \right\rangle }\,, \quad \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,, \end{aligned}$$(49)
where \(\Delta _\psi X^k_{n,t}\) is the standard deviation of the probability measure \({\mathscr {L}}(\Sigma _{n,t}) \ni \Delta \mapsto \langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle \in [0,1]\).
Proof
It is clear that, if an operator \(X_{n,t}^\mu \) exists that satisfies (41), then it must be unique on its domain \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\). That is because, by polarization, any other operator \(S: {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \rightarrow {{{\mathcal {H}}}}\) that satisfies that identity would have the same matrix elements \(\langle \psi '| S\psi \rangle = \langle \psi '| X_{n,t}^\mu \psi \rangle \) when \(\psi ,\psi ' \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). Since this space is dense, we have \( S\psi = X_{n,t}^\mu \psi \). To conclude the proof of the initial statement in (1), it is therefore sufficient to show that (42) is valid. Properties (a)–(e) are then obvious consequences of the analogs for \( N_{n,t}^\mu \) and of the fact that \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is also invariant under U, \( N_{n,t}^\beta \), and \(P_{n\alpha }\). The proof of (42), taking (38) into account, just amounts to proving that
$$\begin{aligned} \eta ^{\alpha \beta }\int _{\Sigma _{n,t}} x^\mu \, d\left\langle \frac{P_{n\alpha }}{H_n}\psi \left| {{\textsf{Q}}}_{n,t}(x) \frac{P_{n\beta }}{H_n}\psi \right. \right\rangle + \int _{\Sigma _{n,t}} x^\mu \, d\left\langle \frac{m}{H_n}\psi \left| {{\textsf{Q}}}_{n,t}(x) \frac{m}{H_n}\psi \right. \right\rangle = 0 \end{aligned}$$
if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and \(\mu =0,1,2,3\). The case \(\mu =0\) is trivial since in that situation \(x^0=t\) can be pulled out of the two integrals and the identity boils down to the trivial one \(\langle \psi | H_n^{-2}(P_{n}^\alpha P_{n\alpha } + m^2 I) \psi \rangle =0\). Regarding the cases \(k=1,2,3\), taking advantage of the spectral decomposition of \( N^k_{n,t}\), the identity above can be rewritten
where we have also used the fact that \({{{\mathcal {S}}}}({{{\mathcal {H}}}}) \subset D( N^k_{n,t})\) and the former space is invariant under the self-adjoint bounded operators \(H_n^{-1}\) and \(H_n^{-1}P_{n\mu }\) as the reader immediately proves. The identity above can be rearranged to the equivalent form (remember that \(H_n= -P_{n0}\))
Representing the identity above in the Hilbert space \(L^2({{\mathbb {R}}}^3, d^3p)\) where \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is represented by \({\mathscr {S}}({{\mathbb {R}}}^3)\) itself, \(P_{n\mu }= p_\mu \cdot \), \(H_n= E_n(p) \cdot \) are multiplicative and, for \(\psi \in {\mathscr {S}}({{\mathbb {R}}}^3)\) we have \( N^{k}_{n,t}\psi = i\frac{\partial }{\partial p_k}\psi \), we see that the two commutators are multiplicative operators as well. Therefore, for instance \( \frac{P_{n\mu }}{H_n} \left[ N^k_{n,t}, \frac{P_{n\nu }}{H_n}\right] = \frac{1}{2} \frac{P_{n\mu }}{H_n} \left[ N^k_{n,t}, \frac{P_{n\nu }}{H_n}\right] + \frac{1}{2} \left[ N^k_{n,t}, \frac{P_{n\nu }}{H_n}\right] \frac{P_{n\mu }}{H_n} = \frac{1}{2} \left[ N^k_{n,t}, \frac{P^2_{n\nu }}{H^2_n}\right] \) and similarly for the other addends. In summary, the identity we need to establish can be rearranged to
which is evidently true, because \( \eta ^{\mu \nu } P_{n\mu } P_{n\nu } +m^2I=0\) on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\), and it completes the proof of (1).
Let us pass to (2) and prove (48). With the same procedure used to prove (1), if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), we find through (38)
The second line can be re-arranged to
The first line vanishes, while the second can be explicitly computed by working in the space \(L^2({{\mathbb {R}}}^3, d^3p)\) exactly as we did for item (1) and it becomes
where \(p_0= -\sqrt{m^2 + \sum _{k=1}^3 p_k^2}\) and the operators \(p_\mu \) being multiplicative. The proof of (48) is over. To prove (49), observe that
By multiplying both sides with \((\Delta _\psi P_k)^2\) and taking advantage of the standard Heisenberg inequality, we get (49). \(\square \)
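To make the origin of the extra addend in (48) more transparent, here is a minimal sketch of the computation in the momentum representation \(L^2({{\mathbb {R}}}^3, d^3p)\). It assumes, as in the proof of item (1), that \(N^k_{n,t}\) acts as \(i\partial /\partial p_k\) on \({\mathscr {S}}({{\mathbb {R}}}^3)\), and it uses the shorthand (introduced only for this sketch) \(E:=\sqrt{m^2+|\vec {p}|^2}\), \(\partial _k := \partial /\partial p_k\), \(f_h := p_h/E\) for \(h=1,2,3\) and \(f_4:= m/E\), so that \(\sum _a f_a^2=1\). Since \(P_{n0}/H_n=-I\), identity (38) gives \({{\textsf{A}}}_{n,t}(\Delta ) = \frac{1}{2}{{\textsf{Q}}}_{n,t}(\Delta ) + \frac{1}{2}\sum _{a=1}^4 f_a {{\textsf{Q}}}_{n,t}(\Delta ) f_a\), and therefore
$$\begin{aligned} \int _{\Sigma _{n,t}} (x^k)^2 d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle&= \frac{1}{2} ||\partial _k \psi ||^2 + \frac{1}{2}\sum _{a=1}^4 ||\partial _k (f_a\psi )||^2\,, \\ \sum _{a=1}^4 ||\partial _k (f_a\psi )||^2&= ||\partial _k \psi ||^2 + \int _{{{\mathbb {R}}}^3} \sum _{a=1}^4 (\partial _k f_a)^2 |\psi |^2 d^3p\,, \\ \sum _{a=1}^4 (\partial _k f_a)^2&= \sum _{h=1}^3\left( \frac{\delta _{hk}}{E} - \frac{p_hp_k}{E^3}\right) ^2 + \frac{m^2p_k^2}{E^6} = \frac{1}{E^2} - \frac{2p_k^2}{E^4} + \frac{p_k^2(|\vec {p}|^2+m^2)}{E^6} = \frac{E^2-p_k^2}{E^4}\,. \end{aligned}$$
The mixed terms in the second line cancel because \(\sum _a f_a \partial _k f_a = \frac{1}{2}\partial _k \sum _a f_a^2 = 0\). Collecting the three lines, and recalling that \(||\partial _k\psi ||^2 = \langle \psi |(N^k_{n,t})^2\psi \rangle \) and \((P_{n0})^2 = E^2\), one recovers the right-hand side of (48). Since the first moments of \({{\textsf{A}}}_{n,t}\) and \({{\textsf{Q}}}_{n,t}\) coincide by (42), the same addend is the difference between the two variances, which is what the derivation of (49) relies on.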
Remark 27
-
(1)
The first-moment operator can be formally written within the QFT setting of Sect. 5.1,
$$\begin{aligned} X^k_{n,0} = \frac{1}{\sqrt{H_n}} P_1 \int _{\Sigma _{n,0}} x^k:{\hat{T}}_{\mu \nu }:(x) n^\mu n^\nu \, d \Sigma _{n,0}(x) P_1 \frac{1}{\sqrt{H_n}}\,. \end{aligned}$$
The internal integral is nothing but the k-component of the boost generator in QFT evaluated at \(t=0\). The position operator obtained in that way coincides with the known Born-Infeld position operator as discussed in [3] and remarked in [36].
-
(2)
Item (2) is of mathematical interest. If the identity were
$$\begin{aligned} \int _{\Sigma _{nt}} (x^k)^2 d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle = \langle \psi | (X^k_{n,t})^2\psi \rangle \,, \end{aligned}$$
since \(X^k_{n,t}\) is symmetric and (41) is true, one could apply a known theorem by Naimark about the decomposition of symmetric operators in terms of POVMs (see Theorem 23 in [11] and the discussion about it). On account of that theorem, the POVM that decomposes \(X^k_{n,t}\) according to (41) would be uniquely determined by its first moment \(X^k_{n,t}\), provided this operator be maximally symmetric on its domain, and it is our case since \(X^k_{n,t}\) is essentially self-adjoint. Along this argument one would conclude that \({{\textsf{A}}}_{nt}= {{\textsf{Q}}}_{nt}\), since the latter POVM (actually a PVM) decomposes \(\overline{X^k_{n,t}} = N^k_{n,t}\) (as in (41) on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\)) in view of the spectral theorem. In summary, the cumbersome addend to the right-hand side of (48) is responsible for the failure of \({{\textsf{A}}}_{nt}= {{\textsf{Q}}}_{nt}\).
-
(3)
Given a pure state represented by a unit vector \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), also the standard Heisenberg inequalities
$$\begin{aligned} \Delta _\psi N^k_{n,t} \Delta _\psi P_{nk} \ge \hbar /2\,, \end{aligned}$$
are valid for \( N_{n,t}^k\) and \(P_{nk}\) in addition to (49), as a consequence of the canonical commutation relations (44). The point is that these relations refer to the physically wrong probability distribution, the one constructed out of the Newton–Wigner PVM \({{\textsf{Q}}}_{n,t}\) instead of the Terno POVM \({{\textsf{A}}}_{n,t}\). \(\blacksquare \)
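As a rough indication of the size of the correction appearing in (48) and (49), the following elementary estimate may help. It is a sketch in the momentum representation with \(\hbar =c=1\), where \(E:=\sqrt{m^2+|\vec {p}|^2}\) is a shorthand introduced here for the multiplicative operator \(-P_{n0}\):
$$\begin{aligned} 0< \frac{E^2-p_k^2}{2E^4} = \frac{m^2 + \sum _{h\ne k} p_h^2}{2E^4} \le \frac{1}{2E^2} \le \frac{1}{2m^2}\,. \end{aligned}$$
Hence, for every unit vector \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the additional addend in (48) is strictly positive and bounded by \(1/(2m^2)\), a value approached by states concentrated around \(\vec {p}=0\). Restoring the physical constants, the bound reads \(\frac{1}{2}(\hbar /(mc))^2\), namely half the square of the reduced Compton wavelength of the particle.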
6 Every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution for \({{\textsf{A}}}\)
This section is devoted to proving that every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution in Castrigiano’s sense, according to (a) in Definition 15, for every family \(\mu ^\psi \) constructed out of the POVMs \({{\textsf{A}}}\) and a pure state \(\psi \in {{{\mathcal {H}}}}\): \(\mu ^{\psi }_{n,t}(\Delta ):= \langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle \).
Remark 28
There are other notions of spatial localization that are causal with respect to time evolution, for instance the localizations in terms of POVMs due to Petzold et al. [17, 18] and Henning and Wolf [23]. The proof in [18] can be made rigorous by means of the mathematical approach developed in this section. \(\blacksquare \)
6.1 The heuristic idea of a conserved probability four-current
The technology I will exploit to prove that \({{\textsf{A}}}_{n,t}\) produces a family of probability measures that satisfies the requirement (a) in Definition 15 for every \(n\in {{\textsf{T}}}_+\) is based on a probability four-current associated to \(\langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle \). As explicitly observed in [36] (I disregard here a number of mathematical details which will be fixed later),
$$\begin{aligned} \langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle = \int _{\Delta } J^{\psi }_{n\mu }(x) n^\mu \, \textrm{d}\Sigma _{n,t}(x)\,, \end{aligned}$$
where \(J^{\psi }_{n}\) satisfies a conservation equation \(\partial ^\mu J^{\psi }_{n \mu } =0\). The existence of such a four-current of probability was postulated in the general case in [25]; see also [17, 18, 23, 26] for the use of similar currents in relation to the causality problem for massive Klein–Gordon particles. A similar current exists for Dirac and Weyl particles [8, 9]. Assuming that \(J^{\psi }_{n}\) is causal, the divergence theorem should imply the validity of the local-causality requirement when restricting to the family of t-parametrized rest spaces of a unique reference frame. I will prove that it is the case in full generality, referring to every Lebesgue set \(\Delta \). The extension to the full family of reference frames, i.e., the proof of the validity of (b) in Definition 15, is not so easy since \(J^{\psi }_{n}(x)\) itself depends on n and one has to compare \(\int _{\Delta } J^{\psi }_{n\mu }(x) n^\mu \textrm{d}\Sigma _{n,t}(x)\) and \(\int _{\Delta } J^{\psi }_{n'\mu }(x) {n'}^\mu \textrm{d}\Sigma _{n',t'}(x)\).
6.2 The probability current of the stress-energy tensor
The first step of the proof consists of explicitly writing down the current \(J^\psi _n\) [36] for the special case \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). As usual, I represent events by means of four-vectors \({{\mathbb {M}}}\ni e= o+ x(e)\) where \(x(e) \in {\textsf{V}}\).
Directly from (36), one has that, if \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\), \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\)
where I introduced the coordinate representation of the stress-energy tensor of \(\Phi ^\psi _n\),
associated to the smooth complex Klein–Gordon field
Notice the further factor \(E^{-1/2}_n(p)\) when comparing with (12) which arises from the analogous factors in the right-hand side of (35). Let us fix a Minkowskian coordinate system \(t=x^0,x^1,x^2,x^3\) comoving with some \(n\in {{\textsf{T}}}_+\). Since the factor of \(e^{i p\cdot x} \) in the integrand stays in \({\mathscr {S}}({{\mathbb {R}}}^3)\), the function \({{\mathbb {R}}}^3 \ni \vec {x} \mapsto \Phi ^\psi _n(t,\vec {x})\) belongs to \({\mathscr {S}}({{\mathbb {R}}}^3)\) as well for every \(t\in {{\mathbb {R}}}\).
Definition 29
If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), \(||\psi ||=1\) and \(n\in {{\textsf{T}}}_+\), the associated probability four-current of \({{\textsf{A}}}\) is the contravariant vector field \(J^\psi _n\) on \({{\mathbb {M}}}\) which, written in coordinates, reads
where \((T^\psi _{\nu \mu })_n\) is defined in (51).
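It is evident that, if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\), (50) yields
$$\begin{aligned} \langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle = \int _{\Delta } J^{\psi }_{n\mu }(x) n^\mu \, \textrm{d}\Sigma _{n,t}(x)\,. \end{aligned}$$(54)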
Proposition 30
If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), \(n\in {{\textsf{T}}}_+\), then \(J^{\psi }_{n}\) is either the zero vector or is causal and past-directed. More precisely:
-
(1)
there is an open dense set \({{\textsf{O}}}^\psi _{n}\subset {{\mathbb {M}}}\) where \(J^\psi _{n}\) is timelike and past-directed;
-
(2)
if \(e\in {{\mathbb {M}}}\setminus {{\textsf{O}}}^\psi _n\), then either \(J^\psi _{n}(e)=0\) or \(J^\psi _{n}(e)\) is lightlike and past-directed;
-
(3)
it holds \({{\textsf{O}}}^\psi _n =\{e \in {{\mathbb {M}}}\,|\; \Phi ^\psi _n(e) \ne 0\}\).
Proof
We need some preparatory identities and inequalities. Consider a Minkowskian coordinate system co-moving with n, so that \(n^\mu = \delta ^\mu _0\) and, if \(\Phi ^\psi _n = A_1+iA_2\) with \(A_i\) real, one can write
where, for \(j=1,2\),
At this juncture observe that, for \(j=1,2\),
Let us pass to prove (1). Define \({{\textsf{O}}}^{\psi }_n\) as the set of events where \(J^\psi _{n\mu }\) is timelike. Let us prove that the set \({{\textsf{O}}}^{\psi }_n\) is dense and open and the vectors in it are past-directed.
(Dense.) It is clear from the found inequality that, in particular, if \(\Phi ^\psi _n(e) \ne 0\) then \(J^\psi _{n\mu } = J^\psi _{1n\mu }+J^\psi _{2n\mu }\) is timelike so that \(e\in {{\textsf{O}}}^{\psi }_n\). If \(x\in {{\mathbb {M}}}\) and \(N \ni x\) is an open neighborhood of it, suppose that there is no \(e\in N\) where \(\Phi ^\psi _n(e)\ne 0\). In particular, \(\Phi ^\psi _n(e)= 0\) in the open spatial set \(\Sigma _{n,t(x)}\cap N\). As a consequence, the spatial derivatives of \(\Phi ^\psi _n\) also vanish on \(\Sigma _{n,t(x)}\cap N\) and (55) produces \(-g(J^\psi _{jn}, J^\psi _{jn})= \frac{1}{4}(\partial _t A_j(e) )^4\). If the right-hand side vanished for all \(e\in \Sigma _{n,t(x)}\cap N\) and \(j=1,2\), we would have that \(\Phi ^\psi _n(t,\cdot )\) and \((\overline{-\Delta +m^2})^{1/2}\Phi ^\psi _n(t,\cdot ) = -i \partial _t \Phi ^\psi _n(t,\cdot )=0\) on that open set in \(\Sigma _{n,t(x)}\). On account of Theorem 11, we would have \(\Phi ^\psi _n(t,\cdot )=0\) and thus \(\psi =0\) by inverting (52) and this is not allowed by hypothesis. We conclude that either \(\Phi ^\psi _n(t,e) \ne 0\) for some \(e\in \Sigma _{n,t(x)}\cap N\) or \(\Phi ^\psi _n(t,e) = 0\) for all \(e\in \Sigma _{n,t(x)}\cap N\), but \(\partial _t\Phi ^\psi _n(t,e) \ne 0\) for some \(e\in \Sigma _{n,t(x)}\cap N\). In both cases, (55) implies that \(J^\psi _{n}\) is timelike somewhere in the neighborhood N of x. We have proved that the set \({{\textsf{O}}}^\psi _n\) where \(J^\psi _n\) is timelike is dense.
(Open.) \({{\textsf{O}}}^\psi _n\) is also the preimage of an open set (the open future cone) according to a continuous map and thus it is open as well.
(Past directed.) Since n is future-directed and \(J^\psi _{jn} \cdot n = J^\psi _{jn0} \ge 0\), we also have that \(J^\psi _n\) is past-directed when it does not vanish.
(2) Consider \(e\in {{\mathbb {M}}}\setminus {{\textsf{O}}}^\psi _n\), namely \(J^\psi _n(e)\) is not timelike. Since \(J^\psi _{n} = J^\psi _{1n}+J^\psi _{2n}\) we have
Notice that all scalar products taking place on the right-hand side above are non-positive: the first two because of (55) and the last one because the two vectors are the limit of past-directed timelike vectors for (1). Since the left-hand side is zero by hypothesis, we have the following two possibilities. \(J^\psi _n(e)\) vanishes (if both \(J^\psi _{1n}\) and \(J^\psi _{2n}\) vanish) or it is lightlike (if one of the two vanishes and the other is lightlike or if both are lightlike and parallel). In all these cases both \(A_1\) and \(A_2\) vanish on account of (55) where \(m>0\), so that \(\Phi ^\psi _n(e)=0\) as well. To conclude, observe that if \(J^\psi _n\) is lightlike, then it must be past-directed by continuity because \({{\textsf{O}}}^\psi _n\) is dense and the vectors in that set are past-directed. The proof of (3) has been given while establishing (1) and (2). \(\square \)
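The key inequality behind the proof can be checked by hand. The following is a minimal sketch under explicit assumptions: coordinates co-moving with n, signature \((-,+,+,+)\), a smooth real field A playing the role of \(A_1\) or \(A_2\), and the covector \(J_\mu := T_{\mu \nu }n^\nu = T_{\mu 0}\) built out of the standard real Klein–Gordon stress-energy tensor \(T_{\mu \nu } = \partial _\mu A \partial _\nu A - \frac{1}{2}\eta _{\mu \nu }(\partial ^\alpha A\partial _\alpha A + m^2A^2)\). These identifications are assumptions of the sketch. One finds
$$\begin{aligned} J_0&= \frac{1}{2}\left( (\partial _t A)^2 + |\nabla A|^2 + m^2 A^2\right) \ge 0\,, \qquad J_k = \partial _t A\, \partial _k A\,, \\ -g(J,J)&= J_0^2 - \sum _{k=1}^3 J_k^2 = \frac{1}{4}\left( (\partial _t A)^2 + |\nabla A|^2 + m^2 A^2\right) ^2 - (\partial _t A)^2 |\nabla A|^2 \\&= \frac{1}{4}\left( (\partial _t A)^2 - |\nabla A|^2\right) ^2 + \frac{1}{2} m^2A^2 \left( (\partial _t A)^2 + |\nabla A|^2\right) + \frac{1}{4} m^4 A^4 \ge \frac{1}{4} m^4 A^4\,. \end{aligned}$$
Under these assumptions J is either zero or causal, it is timelike wherever \(A\ne 0\) (here \(m>0\) is essential), and \(-g(J,J)\) reduces to \(\frac{1}{4}(\partial _tA)^4\) at the events where A and its spatial derivatives vanish, in agreement with the values used in the density argument.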
6.3 Every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution for \({{\textsf{A}}}\)
First of all, observe that if \(D\subset \Sigma _{n,t_1}\) is an open ball, then \(J^\pm (D)\) are open as well, as follows by direct inspection. This immediately implies that \(J^\pm (\Delta _1)\) are open if \(\Delta _1 \subset \Sigma _{n,t_1}\) is open and non-empty. As a consequence, when \(\Delta _1 \subset \Sigma _{n,t_1}\) is open, the intersections \(J^\pm (\Delta _1) \cap \Sigma _{n',t'}\) are open as well in the relative topology. I will use this fact several times in the rest of the paper.
Lemma 31
Consider the spatial localization observable \({{\textsf{A}}}\). Take \(n\in {{\textsf{T}}}_+\) and \(t_1,t_2 \in {{\mathbb {R}}}\) with \(t_2 \ne t_1\). Let \(\Delta _1 \subset \Sigma _{n,t_1}\) be a finite union of non-empty open balls with finite radius, and let \(\Delta _2:= (J^+(\Delta _1) \cup J^-(\Delta _1))\cap \Sigma _{n,t_2}\) be the corresponding open set in \(\Sigma _{n,t_2}\). Then
$$\begin{aligned} \langle \psi |{{\textsf{A}}}_{n,t_1}(\Delta _1)\psi \rangle \le \langle \psi |{{\textsf{A}}}_{n,t_2}(\Delta _2)\psi \rangle \end{aligned}$$(56)
is valid for every \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) with \(||\psi ||=1\).
Proof
As a first case, we assume that \(\Delta _1 \subset \Sigma _{n,t_1}\) is an open ball of finite radius, so that \(\Delta _2\) is an analogous open ball in \(\Sigma _{n,t_2}\). Let us suppose \(t_2>t_1\) (the other case is analogous) and consider \(B \subset {{\mathbb {M}}}\) whose boundary is made of the two bases \(\Delta _1\), \(\Delta _2\), and the portion L of \(\partial J^+(\Delta _1)\) between them. B is a manifold with boundary and we can use the Stokes–Poincaré theorem for the 3-form (Footnote 9)
associated to the current \(J^\psi _n\) for the considered \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). We have chosen a Minkowskian coordinate system \(t=x^0,x^1,x^2,x^3\) comoving with n to write down the components of \(\nu ^\psi _n\) as above. With the choices above, the integral of the form on \(\Delta _{t_2}\) gives
Since \(J^\psi _n\) is conserved, the integral of \(\nu ^\psi _n\) on B vanishes, so that,
To compute the integral we change coordinates and we pass to a system of lightlike and polar coordinates \(u,v, \theta , \phi \) where \(r, \theta ,\phi \) are standard polar spherical coordinates in \(\Sigma _{n,t_1}\) with center given by the center of \(\Delta _1\) and \(u:= t+r\), \(v:= t-r\) so that u is a lightlike future increasing coordinate along L. With these coordinates,
and, writing J for \(J_n^\psi \),
Now, observe that, since \(J^\psi _n\) is past directed (if it does not vanish), we must have \(2J^t = J^u+ J^v \le 0\). The condition that \(J^\psi _n\) is zero or causal reads
where h is the Euclidean metric on \(\Sigma _{n,t}\) and \(\vec {J}\) the spatial part of \(J_n^\psi \). In summary, \(J^uJ^v \ge 0\) and \(J^u+J^v \le 0\), so that \(J^v,J^u \le 0\). Since \(\theta \in [0,\pi ]\) in (58) and \(v=0\) on L, we conclude that
Up to now we have established that
To conclude the proof, it is sufficient to observe what follows in the case \(\Delta _1\) is a finite union of finite-radius open balls \(\Delta ^{(j)}_{1}\), \(j=1,\ldots , N\). We can always assume that no ball of the family is a subset of another ball of the family. Since N is finite, the region of \(\partial J^+(\Delta _1)\) between \(t_1\) and \(t_2\) is a piecewise smooth lightlike submanifold and we can apply the above reasoning by changing coordinates for every cone of the family. The integral over the surface \(\partial J^+(\Delta _1)\) between \(t_1\) and \(t_2\) is a finite sum of contributions of type (59) where each integral is now performed on a smaller portion of each conical surface. However, each contribution is nonnegative because the integrated function is nonnegative. \(\square \)
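The sign argument used in the proof can be spelled out as follows. This is only a sketch under stated assumptions: signature \((-,+,+,+)\), and components \(J^u = J^t + J^r\), \(J^v = J^t - J^r\) induced by \(u=t+r\), \(v=t-r\) (consistent with \(2J^t = J^u+J^v\) above), where \(J^r\) denotes the radial component of \(\vec{J}\).

$$\begin{aligned} J^uJ^v = (J^t)^2 - (J^r)^2 \ge (J^t)^2 - h(\vec{J},\vec{J}) \ge 0\,. \end{aligned}$$

The first inequality uses \((J^r)^2 \le h(\vec{J},\vec{J})\) and the second is the condition that J is zero or causal. Hence \(J^u\) and \(J^v\) have the same sign whenever both are non-zero, and \(J^u+J^v = 2J^t\le 0\) then forces \(J^u\le 0\) and \(J^v \le 0\).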
Remark 32
Even if it is not strictly necessary for our final goal, I prove that, when restricting to a suitable dense subspace of \({{{\mathcal {S}}}}(\mathcal{H})\), the inequality in (56) can be made strict. I consider a subspace \({{{\mathcal {D}}}}({{{\mathcal {H}}}}) \subset {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) of vectors \(\psi \in {{\mathcal {H}}}\) such that there is \(n\in {{\textsf{T}}}_+\) and a Minkowski coordinate system co-moving with n such that \({{\mathbb {R}}}^3 \ni \vec {p} \mapsto \psi (E_n(p), \vec {p}_n) \in {{\mathscr {D}}}({{\mathbb {R}}}^3)\) (the test-function space on \({{\mathbb {R}}}^3\)) when represented in the spatial coordinates on \({{\mathbb {R}}}^3\). The definition of \({{{\mathcal {D}}}}({{{\mathcal {H}}}})\) does not depend on the choice of n and of the co-moving Minkowskian coordinates, as \({{{\mathcal {D}}}}({{{\mathcal {H}}}})\) is invariant under the representation U of \(IO(1,3)_+\) in (7). Finally, \(\mathcal{D}({{{\mathcal {H}}}}) \subset {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is dense in \({{{\mathcal {H}}}}\). The proof of these elementary facts is analogous to the one for \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and is left to the reader.
Relying on the well-posedness of the Characteristic Cauchy problem on Lorentzian cones, the following precise result is valid.
Proposition 33
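With the hypotheses of Lemma 31, if \(\psi \in \mathcal{D}({{{\mathcal {H}}}})\) with \(||\psi ||=1\), then inequality (56) holds in the sharpest form
$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1)\psi \rangle < \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2)\psi \rangle \,. \end{aligned}$$ (61)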
Proof
See “Appendix A”. \(\square \)
\(\blacksquare \)
I come back to the main stream of the reasoning with a second lemma.
Lemma 34
Consider the spatial localization observable \({{\textsf{A}}}\). Take \(n\in {{\textsf{T}}}_+\) and \(t_1,t_2 \in {{\mathbb {R}}}\) with \(t_2 \ne t_1\). Let \(\Delta _1 \subset \Sigma _{n,t_1}\) be a non-empty open set (respectively, a compact set), and let \(\Delta _2:= (J^+(\Delta _1) \cup J^-(\Delta _1))\cap \Sigma _{n,t_2}\) be the corresponding open (resp. compact) set in \(\Sigma _{n,t_2}\). Then
$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2)\psi \rangle \end{aligned}$$ (62)
is valid for every \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) with \(||\psi ||=1\).
Proof
We always assume \(t_2> t_1\), since the other case has a similar proof. First of all, we already know that if \(\Delta _1\) is open then \(\Delta _2\) is open as well. The case of \(\Delta _1\) compact is a subcase of a known fact valid in globally hyperbolic spacetimes (like \({{\mathbb {M}}}\)): if K is compact, the intersection of \(J^+(K)\) and a spacelike Cauchy surface (like \(\Sigma _{n,t_2}\)) is compact as well.
Let us first examine the case of \(\Delta _1\subset \Sigma _{n,t_1}\) open. According to Theorem 1.26 in [12], for every \(\delta >0\), there exists a countable collection \(\{\Gamma _j\}_{j=1,2,\ldots }\) of disjoint (non-empty) closed balls \(\Gamma _j \subset \Delta _1\) with diameter less than \(\delta \), such that
where we remind the reader that \(\textrm{d}\Sigma _{n,1}\) is the Lebesgue measure when written in the spatial Minkowskian coordinates comoving with n. Evidently we can assume that the balls are open (and their closures are disjoint) since \(\partial \Gamma _j\) has zero Lebesgue measure. Let us define \(\Delta _1':= \bigcup _{j\in {{\mathbb {N}}}} \Gamma _j\) and \(\Delta '_2:= \Sigma _{n,t_2} \cap J^+(\Delta '_1)\). Since the probability measure defined by \({{\textsf{A}}}_{n,t_1}\) and \(\psi \) is per definition absolutely continuous with respect to the Lebesgue measure, (63) yields \(\langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta '_1)\psi \rangle = \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1)\psi \rangle \in [0,+\infty ]\). Furthermore, since \(\Delta '_1 \subset \Delta _1\), it must be \(\Delta _2' \subset \Delta _2\) and thus \(\langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta '_2)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2)\psi \rangle \). In summary, to prove the thesis, it is sufficient to establish that \(\langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta '_1)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta '_2)\psi \rangle \). Let us define \(\Delta _1^N:= \cup _{j=1}^N \Gamma _j\) and \(\Delta _2^N:= J^{+}(\Delta _1^N) \cap \Sigma _{n,t_2}\). By additivity and taking Lemma 31 into account,
Notice that the limit of the right-most side exists because the sequence is non-decreasing as \(\Delta _2^N \subset \Delta _2^{N+1} \subset \Delta _2'\) by construction.
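The chain of relations behind this step can be sketched as follows. It is a reconstruction from the surrounding text, assuming \(t_2>t_1\) and that the balls \(\Gamma _j\) have been taken open with disjoint closures, so that each \(\Delta _1^N\) satisfies the hypotheses of Lemma 31.

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta '_1)\psi \rangle = \lim _{N\rightarrow +\infty } \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta ^N_1)\psi \rangle \le \lim _{N\rightarrow +\infty } \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta ^N_2)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta '_2)\psi \rangle \,. \end{aligned}$$

The equality is \(\sigma \)-additivity on the disjoint family \(\{\Gamma _j\}\), the first inequality is Lemma 31 applied for every finite N, and the last one is monotonicity, since \(\Delta _2^N \subset \Delta '_2\).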
Let us pass to prove the thesis for \(\Delta _1\) compact. Since \(\Sigma _{n,t_1}\) is a metric space and \(\Delta _1\) compact, it is not difficult to construct a sequence of open sets \(A_1 \supset A_2 \supset \cdots \supset \Delta _1\) such that
Each \(A_j\) is the union of a finite (but arbitrarily large) number of balls centered on some points of \(\Delta _1\) with radius less than \(\delta _j \rightarrow 0^+\). As a consequence
The inclusion \(\subset \) immediately arises from the definitions; the other inclusion is less trivial. Let us prove it. If e belongs to the right-hand side of the identity above and, as said, \(A_j\) is the finite union of balls of radius \(\delta _j>0\) centered on some points of \(\Delta _1\), we have that (see Footnote 10) \(\text{ dist }(e, J^+(\Delta _1) \cap \Sigma _{n,t_2}) < \delta _j\) for every j, with \(\delta _j \rightarrow 0^+\). As a consequence e is an accumulation point of \(\Delta _2 = J^+(\Delta _1) \cap \Sigma _{n,t_2}\), which is compact, thus closed (the space being Hausdorff). Hence \(e \in \Delta _2\). Finally, taking advantage of the already proved result on open sets and internal continuity
\(\square \)
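A minimal sketch of how the final step for compact \(\Delta _1\) can be assembled is the following. It is a reconstruction, assuming \(t_2>t_1\), the decreasing sequence of open sets \(A_j \supset \Delta _1\) introduced above, and the identification of \(\Delta _2\) with the intersection of the sets \(J^+(A_j)\cap \Sigma _{n,t_2}\) discussed in the proof.

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_1}(A_j)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(J^+(A_j)\cap \Sigma _{n,t_2})\psi \rangle \rightarrow \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2)\psi \rangle \quad \text{as } j\rightarrow +\infty \,. \end{aligned}$$

The first inequality is monotonicity (\(\Delta _1 \subset A_j\)), the second is the result already proved for open sets, and the limit relies on the continuity of a finite measure along a decreasing sequence of sets.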
I am now in a position to prove the main result of this section, that every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution (according to (a) in Definition 15) for every spatial localization probability measure constructed out of the Terno POVM \({{\textsf{A}}}\) and every pure state \(\psi \in {{{\mathcal {H}}}}\).
Theorem 35
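Consider the spatial localization observable \({{\textsf{A}}}\). Take \(n\in {{\textsf{T}}}_+\) and \(t_1,t_2 \in {{\mathbb {R}}}\). Let \(\Delta _1 \subset \Sigma _{n,t_1}\) be a Lebesgue set and let \(\Delta _2:= (J^+(\Delta _1) \cup J^-(\Delta _1))\cap \Sigma _{n,t_2}\) be the corresponding set in \(\Sigma _{n,t_2}\). Then
$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2)\psi \rangle \quad \text{for every } \psi \in {{{\mathcal {H}}}}\,. \end{aligned}$$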
In other words, every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution according to (a) in Definition 15 for the family of spatial localization probability measures \(\mu ^\psi (\cdot ):= \langle \psi | {{\textsf{A}}}(\cdot ) \psi \rangle \).
Proof
First of all, notice that \(\mu ^{\psi }_{n,t}(\cdot ):= \langle \psi | {{\textsf{A}}}_{n,t}(\cdot ) \psi \rangle \), for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), is necessarily regular when restricted to \({\mathscr {B}}(\Sigma _{n,t})\), since \(\Sigma _{n,t}\) is a countable union of compacts with finite measure (Theorem 2.18 in [33]). As a consequence the completion \(\overline{\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}}\) of \(\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}\) is regular as well (Prop. 1.59 in [10]). The \(\sigma \)-algebra of the regular complete measure \(\overline{\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}}\) includes the Lebesgue \(\sigma \)-algebra in particular, and the completion \(\overline{\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}}\) restricted to \({\mathscr {L}}(\Sigma _{n,t})\) coincides with \(\mu ^{\psi }_{n,t}\) itself. This can be seen as follows. The \(\sigma \)-algebra of a completion \({\overline{\mu }}\), where \(\mu : {\mathscr {S}}(X) \rightarrow [0,+\infty ]\) is a positive \(\sigma \)-additive measure, can be constructed as the family of sets \(E\cup Z\) where \(E\in {\mathscr {S}}(X)\) and \(Z\subset F \in {\mathscr {S}}(X)\) with \(\mu (F)=0\). Obviously \({\overline{\mu }}(E\cup Z):= \mu (E)\). From these properties we can write \(\overline{\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}}(G)=\mu ^{\psi }_{n,t}(G)\) if \(G\in {\mathscr {L}}(\Sigma _{n,t})\), since \(G= E\cup Z\) where \(E\in {\mathscr {B}}(\Sigma _{n,t})\) and \(Z \subset F \in {\mathscr {B}}(\Sigma _{n,t})\) such that F has zero Lebesgue measure and thus \(\mu ^{\psi }_{n,t}(F)=0\) because \(\mu ^{\psi }_{n,t}\) is absolutely continuous with respect to the Lebesgue measure. We conclude that \(\mu ^{\psi }_{n,t}\) is regular on the Lebesgue \(\sigma \)-algebra because it is the restriction of a regular measure. In particular, it is inner regular.
So, if \(\Delta _1\) is Lebesgue-measurable, for \(\psi \in {{{\mathcal {S}}}}(\mathcal{H})\) we can take advantage of Lemma 34 proving that
where we have also used the fact that \(J^+(K) \cap \Sigma _{n,t_2} \subset J^+(\Delta _1) \cap \Sigma _{n,t_2} = \Delta _2\).
The thesis is therefore true if \(\psi \in {{{\mathcal {S}}}}(\mathcal{H})\) with \(||\psi ||=1\). Evidently the last requirement can be dropped by bi-linearity of the scalar product. Since \({{{\mathcal {S}}}}(\mathcal{H})\) is dense in \({{{\mathcal {H}}}}\) and the scalar product is continuous, the result extends to the whole Hilbert space and the proof is over. \(\square \)
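The two steps of this proof can be summarized in the following sketch. It is a reconstruction, assuming \(t_2>t_1\) and writing K for a generic compact subset of \(\Delta _1\). For \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), inner regularity and Lemma 34 give

$$\begin{aligned} \mu ^{\psi }_{n,t_1}(\Delta _1) = \sup _{K \subset \Delta _1} \mu ^{\psi }_{n,t_1}(K) \le \sup _{K \subset \Delta _1} \mu ^{\psi }_{n,t_2}\left( J^+(K)\cap \Sigma _{n,t_2}\right) \le \mu ^{\psi }_{n,t_2}(\Delta _2)\,. \end{aligned}$$

For a generic \(\psi \in {{{\mathcal {H}}}}\), one picks a sequence \({{{\mathcal {S}}}}({{{\mathcal {H}}}}) \ni \psi _k \rightarrow \psi \) (the index k is used here only for illustration). Since the effects are bounded operators,

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1)\psi \rangle = \lim _{k\rightarrow +\infty } \langle \psi _k | {{\textsf{A}}}_{n,t_1}(\Delta _1)\psi _k\rangle \le \lim _{k\rightarrow +\infty } \langle \psi _k | {{\textsf{A}}}_{n,t_2}(\Delta _2)\psi _k\rangle = \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2)\psi \rangle \,. \end{aligned}$$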
Corollary 36
There is no state \(\psi \in {{{\mathcal {H}}}}\) that satisfies the hypotheses of the Hegerfeldt theorem (Theorem 19) for any family of bounded balls in the rest space of any arbitrarily fixed \(n\in {{\textsf{T}}}_+ \).
Proof
The thesis of Hegerfeldt’s theorem is incompatible with the result of the previous theorem. \(\square \)
7 Subtleties with the notion of position and Castrigiano’s causality requirement
There is a crucial feature of the notion of spatial position by Terno: it uses a four-current of probability that, in spite of being a four-vector, depends on the reference frame n, as is evident in (53) when \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). That is an unavoidable fact since the notion of energy-momentum current has the same type of dependence: \(J^\nu _n = n^\mu {T_{\mu }}^\nu \). This feature leads to a more articulated picture where one can define the probability to find a particle in \(\Delta \subset \Sigma _{n',t'}\) still referring to the current associated to \(n\ne n'\)! That is permitted because
in view of Proposition 30, when \(n'\in {{\textsf{T}}}_+\). In fact \(J^{\psi \mu }_{n}(x)\) is causal and past directed or vanishes producing the inequality above just because \(n'\) is timelike and future directed. So that, if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), one can define a spatial localization probability
The divergence theorem, exploiting the fact that \( J^{\psi \mu }_{n}(x)\) rapidly vanishes at spatial infinity and that \(\partial _\mu J^{\psi \mu }_{n}(x)= \partial _\mu n^\nu T^{\psi \mu }_{\nu }(x)_n=0\), assures the correct normalization
Physically speaking, \(\mu ^{\psi ,n}_{n',t'}(\Delta )\) accounts for the probability to find a particle in \(\Delta \subset \Sigma _{n',t'}\) using detectors which are at rest in n but synchronized with \(n'\). There is no reason why this probability should coincide with \(\mu ^\psi _{n',t'}(\Delta )=\langle \psi |{{\textsf{A}}}_{n',t'}(\Delta )\psi \rangle \) as the corresponding energy densities do not. This result opens a new perspective on the notion of spatial localization which deserves to be investigated.
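The normalization claimed above can be sketched as follows. The sketch assumes that the localization probability is defined as \(\mu ^{\psi ,n}_{n',t'}(\Delta ) = \int _\Delta J^{\psi \mu }_{n}n'_\mu \, \textrm{d}\Sigma _{n',t'}\), consistently with the density \(J_{n}^\mu n'_\mu \ge 0\), and that \(||\psi ||=1\). Applying the divergence theorem to the conserved current in the region bounded by \(\Sigma _{n',t'}\) and \(\Sigma _{n,t}\), with vanishing flux at spatial infinity, one finds

$$\begin{aligned} \int _{\Sigma _{n',t'}} J^{\psi \mu }_{n}n'_\mu \, \textrm{d}\Sigma _{n',t'} = \int _{\Sigma _{n,t}} J^{\psi \mu }_{n}n_\mu \, \textrm{d}\Sigma _{n,t} = \langle \psi | {{\textsf{A}}}_{n,t}(\Sigma _{n,t})\psi \rangle = ||\psi ||^2 = 1\,. \end{aligned}$$

The second equality uses the fact that, for \(n'=n\), the density reduces to the normalized energy density defining the Terno POVM.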
Mathematically speaking, all that can be encapsulated into a new family of POVMs depending on both n and \(n'\) (and \(t'\)).
Theorem 37
If \(n,n' \in {{\textsf{T}}}_+\) and \(t'\in {{\mathbb {R}}}\), there is only one POVM with effects \({{\textsf{M}}}^n_{n',t'}(\Delta ) \in {{\mathfrak {B}}}({{{\mathcal {H}}}}) \) for \(\Delta \in {\mathscr {L}}(\Sigma _{n',t'})\) such that
Furthermore the following holds.
(1) It has the form, in terms of the Newton–Wigner POVM \({{\textsf{Q}}}_{n',t'}\) on \(\Sigma _{n',t'}\),
$$\begin{aligned} {{\textsf{M}}}^n_{n',t'}(\Delta ) =&\ \frac{1}{2}\left( \sqrt{\frac{H_{n'}}{H_{n}}}{{\textsf{Q}}}_{n',t'}(\Delta ) \sqrt{\frac{H_{n}}{H_{n'}}} + \sqrt{\frac{H_{n}}{H_{n'}}} {{\textsf{Q}}}_{n',t'}(\Delta ) \sqrt{\frac{H_{n'}}{H_{n}}}\right) \\&-\frac{n\cdot n}{2} \sqrt{\frac{H_{n'}}{H_{n}}}\left( \eta ^{\mu \nu }\frac{P_{n\mu }}{H_{n'}} {{\textsf{Q}}}_{n',t'}(\Delta ) \frac{P_{n\nu }}{H_{n'}} + \frac{m}{H_{n'}} {{\textsf{Q}}}_{n',t'}(\Delta ) \frac{m}{H_{n'}} \right) \sqrt{\frac{H_{n'}}{H_{n}}}\,. \end{aligned}$$ (67)
(where the various everywhere-defined bounded composite operators \(H_n/H_{n'}\), etc., are defined in terms of the joint spectral measure of \(P^\mu \) and standard spectral calculus).
(2) It reduces to the Terno POVM for \(n=n'\):
$$\begin{aligned} {{\textsf{M}}}^n_{n,t}(\Delta ) = {{\textsf{A}}}_{n,t}(\Delta )\,, \text{ if }~n\in {{\textsf{T}}}_+, t\in {{\mathbb {R}}}~\text{ and }~ \Delta \in {\mathscr {L}}(\Sigma _{n,t}). \end{aligned}$$ (68)
(3) The \(IO(1,3)_+\) covariance relations are valid,
$$\begin{aligned} U_{h} {{\textsf{M}}}^n_{n',t'}(\Delta ) U_{h}^{-1} = {{\textsf{M}}}^{\Lambda _h n}_{\Lambda _h n', t'_h}(h\Delta ) \,, \quad \forall \Delta \in {\mathscr {L}}(\Sigma _{n',t'})\,, \quad \forall h \in IO(1,3)_+\,. \end{aligned}$$ (69)
Proof
(Initial statement and (1)). Let us call F the operator defined by the right-hand side of (67). It is evidently everywhere defined and bounded on \({{{\mathcal {H}}}}\). By polarization and density of \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\), it is completely determined by the values \(\langle \psi |F \psi \rangle \) when \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). Let us prove that it satisfies (66). Per direct inspection we have that, if \(\psi \in {{{\mathcal {S}}}}({{\mathcal {H}}})\), taking (51) and (52) into account, the right-hand side of (66) can be written, with \(-n'\cdot x= t'\)
which, in turn, coincides with \(\langle \psi |F \psi \rangle \) when taking (17) into account, as wanted. Notice that (66) implies that the everywhere defined extended operator \({{\textsf{M}}}^n_{n',t'}(\Delta )\) is positive as it is the continuous extension of a positive operator. The family of these operators, with \(n, n',t'\) fixed, is also weakly \(\sigma \)-additive in \(\Delta \) because \({{\textsf{Q}}}_{n',t'}\) in the right-hand side of (67) is weakly \(\sigma \)-additive, and the operators appearing as factors are bounded and everywhere defined. As the family \({{\textsf{M}}}^n_{n',t'}(\Delta )\), with \(\Delta \) variable in \({\mathscr {L}}(\Sigma _{n',t'})\), is made of positive operators with \({{\textsf{M}}}^n_{n',t'}(\Sigma _{n',t'})=I\) (direct inspection), we conclude that the said family (with n fixed) is a (normalized) POVM on \({\mathscr {L}}(\Sigma _{n',t'})\).
(2) It is obvious from (67) and (38).
(3) The proof immediately arises from the analogous covariance properties of \({{\textsf{Q}}}_{n,t}\) and the basic covariance properties of \(H_n\) and of the composite (bounded everywhere defined) operators \(H_n/H_{n'}\), \(m/H_{n}\), \(P_{n'}^\mu /H_n\). \(\square \)
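Item (2) can be made explicit with the following sketch. It assumes the signature \((-,+,+,+)\), so that \(n\cdot n=-1\) and \(\eta ^{00}=-1\) in coordinates comoving with n, and that \(P_{n0}\) acts as \(\mp H_n\). Setting \(n'=n\) in (67), the ratios \(H_{n'}/H_n\) reduce to the identity and \(\frac{P_{n0}}{H_n}{{\textsf{Q}}}_{n,t}(\Delta )\frac{P_{n0}}{H_n} = {{\textsf{Q}}}_{n,t}(\Delta )\), so that

$$\begin{aligned} {{\textsf{M}}}^n_{n,t}(\Delta ) = \frac{1}{2}\left( {{\textsf{Q}}}_{n,t}(\Delta ) + \sum _{k=1}^3 \frac{P_{nk}}{H_{n}} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{P_{nk}}{H_{n}} + \frac{m}{H_{n}} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{m}{H_{n}} \right) \,, \end{aligned}$$

which is the expression to be compared with (38). As a consistency check, for \(\Delta = \Sigma _{n,t}\) one has \({{\textsf{Q}}}_{n,t}(\Sigma _{n,t})=I\), and the identity \(\sum _k P_{nk}^2 + m^2 = H_n^2\) yields \({{\textsf{M}}}^n_{n,t}(\Sigma _{n,t}) = I\).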
Remark 38
For a given \(n_0 \in {{\textsf{T}}}_+\), the physical meaning of the family of POVMs
is the notion of spatial position observable, referred to all reference frames \(n\in {{\textsf{T}}}_+\) and every global time \(t\in {{\mathbb {R}}}\) of each such reference frame, when the used class of detectors is always co-moving with \(n_0\).
To conclude this work, I prove that for every given \(n_0 \in {{\textsf{T}}}_+\), the family of POVMs \({{\textsf{M}}}_{n_0}\) satisfies Castrigiano’s causality condition.
Theorem 39
For given \(n_0\in {{\textsf{T}}}_+\) and \(\psi \in {{{\mathcal {H}}}}\), define the family of probability measures \(\mu ^{\psi , n_0}_{n,t}\)
That family satisfies Castrigiano’s causality condition (b) in Definition 15.
where \(\Delta ':= \left( J^+(\Delta ) \cup J^-(\Delta ) \right) \cap \Sigma _{n',t'}\).
In particular, the time evolution associated to every n is causal according to (a) Definition 15.
Sketch of proof. Condition (a) in Definition 15 is satisfied if condition (b) holds, so that it suffices to prove the validity of the latter. The proof of Theorem 35 and its preparatory lemmata can be performed also for the considered case since the only relevant two facts, for \(\psi \in {{{\mathcal {S}}}}(\mathcal{H})\), are that (i) the values of \(\mu ^{\psi , n_0}_{n,t}(\Delta )\) and \(\mu ^{\psi , n_0}_{n',t'}(\Delta ')\)—where for the moment \(n=n'=n_0\)—are spatial boundary integrals of the conserved four current \(J^\psi _{n_0}\) and (ii) that \(J^\psi _{n_0}\) is either zero or causal and past directed. These facts are valid also dropping the requirement \(n=n'=n_0\). It does not matter if the normal vectors n and \(n'\) to the two hyperplanes containing, respectively, \(\Delta \) and \(\Delta '\) are both parallel to the vector \(n_0\) defining \(J^\psi _{n_0}\) or not, so we can definitely drop the requirement \(n=n'=n_0\). Indeed, in proving Theorem 35 the bases of the four-dimensional solid used to integrate the current were orthogonal to \(n_0\) just as a contingent fact, due to the very definition of the measures \(\mu ^\psi _{n,t}\) which is now relaxed. The only case where the above proof has to be slightly changed is when the possible intersection of \(\Sigma _{n,t}\) and \(\Sigma _{n',t'}\) passes through \(\Delta \). In that case it is convenient to treat separately the two parts of \(\Delta \). \(\Box \)
8 Discussion
In this work, I rigorously proved that, when referring only to issue (I1) of the Introduction, a spatial notion of localization for a massive Klein–Gordon particle is possible without problems with causality (with some caveats, however, see below), avoiding in particular the pathologies predicted by Hegerfeldt’s theorem. As has long been known, this latter obstruction prevents, in particular, the existence of spatially localized states. The crucial mathematical notion here is the covariant family of POVMs \({{\textsf{A}}}\) proposed by Terno [36], which has been analyzed in broad mathematical detail, focusing on its interplay with the popular Newton–Wigner notion of spatial localization. This analysis showed that the notion of localization based on the POVM \({{\textsf{A}}}\), and the associated first moment in particular, keep many good properties of the Newton–Wigner localization notion while dropping many of its problematic features. To what extent this notion is compatible with the interplay of causality and post-measurement state (I2) was not the object of this work and will be investigated elsewhere. Terno’s notion seems in good agreement with Castrigiano’s notion of causal evolution ((a) Definition 15). The validity of the very Castrigiano causality condition ((b) Definition 15) needs more care and a different, perhaps physically more subtle, analysis than the case of causal systems rigorously treated by Castrigiano [8]. Terno’s notion of spatial localization relies upon the notion of energy density and not upon the notion of density of charge. The former is associated to a conserved tensor field, the stress energy tensor \(T_{\mu \nu }\), instead of a vector field. As a matter of fact, the relevant probability density in the reference frame n is the normalized energy density \(T_{\mu \nu } n^\mu n^\nu \).
This choice has the apparent drawback that probability densities of different reference frames turn out to be incomparable, just because the densities \(T_{\mu \nu } n^\mu n^\nu \) and \(T_{\mu \nu } n'^\mu n'^\nu \) are not connected by the standard argument based on the conservation law \(\partial _\mu T^{\mu \nu }=0\) and the Stokes–Poincaré theorem. That law allows one to compare different boundary terms where only one normal vector is changed, instead of one pair at a time: \(n,n \rightarrow n',n'\). Testing Castrigiano’s causality condition seems to be impossible along that way. However, the physical interpretation turns out to be of some help at this juncture. The double occurrence of n can be relaxed to a single occurrence of a pair of different timelike future-oriented unit vectors, \(n,n'\). The fact that the density \(T^{\mu \nu } n_\mu n_\nu '\) is still positive suggests a new and different operational interpretation of the notion of spatial position. To assert that the particle stays in \(\Delta \subset \Sigma _{n',t'}\) one should not only specify the reference frame \(n'\) and the instant of time \(t'\), but one should also make explicit the choice of the rest frame n of the employed detectors (which actually are energy detectors). The relevant density therefore is \(J_{n}^\mu n'_\mu \ge 0\), where \(J_{n}^\mu := n^\nu T_{\nu }^\mu \). This picture includes the apparently most natural choice \(n= n'\), but one is also allowed to pick out \(n\ne n'\). Keeping n fixed and varying \(n'\) and \(t'\) produces a new family of POVMs \({{\textsf{M}}}^n_{n',t'}\). This family satisfies both requirements (a) and (b) in Definition 15, in particular, Castrigiano’s causality condition (b). It is not clear to the author if this approach is really physically meaningful and the subject certainly deserves further investigation and discussion.
Actually something can be said about the causal relation of \({{\textsf{A}}}_{n,t}(\Delta _1)\) and \({{\textsf{A}}}_{n',t'}(\Delta _2)\), where \(\Delta _2 = (J^+(\Delta _1) \cup J^-(\Delta _1)) \cap \Sigma _{n',t'}\) and \(n\ne n'\), on the grounds of a purely mathematical observation. However, it is not clear if this reasoning may lead to a proof of Castrigiano’s causality condition, especially because there is no evident physical reason behind the following argument. If one assumes that \(\psi \in {{{\mathcal {D}}}}({{{\mathcal {H}}}})\), and that \(\Delta _1\subset \Sigma _{n,t_1}\) has the special form as in Proposition 33, then the sharp inequality (61) is valid. Therefore, for continuity reasons, keeping fixed \(\psi \), n and \(t=t_1\) on the left-hand side of (61), that inequality must be still valid if one slightly moves \(n'\) away from n and \(t'\) away from \(t_2\), changing \(\Delta _2\) accordingly. If the neighborhood of values \((n',t')\) around (n, t) where this inequality holds were the entire \({{\textsf{T}}}_+\times {{\mathbb {R}}}\), one could use an improvement of the argument already exploited in the main text to pass from the special type of set \(\Delta _1\) to a generic element of \({\mathscr {L}}(\Sigma _{n,t})\), possibly relaxing < to \(\le \). The usual density argument of \({{{\mathcal {D}}}}({{{\mathcal {H}}}})\) in \({{{\mathcal {H}}}}\) would conclude the proof. However, I do not think that the said neighborhood of (n, t) covers the full set of possibilities of the choice of \((n',t')\). All that will be investigated elsewhere.
Notes
It is easy to prove that the result does not depend on the choice of o.
which eventually can be proved to be unique on account of Wightman uniqueness theorem above mentioned [38].
According to Sect. 2.3 the vector \(\varphi _\psi (t,\cdot )\) can be viewed in terms of a representative given by a Lebesgue measurable or a Borel measurable function and one interprets “a.e.” accordingly.
I am grateful to Prof. Castrigiano for clarifications on these issues.
Referring to general, quite realistic, measurement instruments, the post-measurement state is not pure even if the initial state is.
That theorem includes the hypothesis “\(\langle \psi | {{\textsf{E}}}_\Delta \psi \rangle =1\) and \(\langle \varphi |{{\textsf{E}}}_\Delta \varphi \rangle =0\) implies \(\langle \psi |\varphi \rangle =0\) ”. However, it is not necessary since it is automatically satisfied by every POVM \({{\textsf{E}}}\).
A similar equation appears as Eq. (A18) in Terno’s paper [36].
One cannot take advantage of the vector field version of the theorem because the portion L of the boundary has a degenerated induced metric.
This is valid if \(\Sigma _{n,t_1}\) and \(\Sigma _{n',t_2}\) are parallel as it is since we are assuming \(n=n'\). However, a similar argument is valid if \(n\ne n'\), finding \(\text{ dist }(e, J^+(\Delta _1) \cap \Sigma _{n',t_2}) <\epsilon \delta _j\) for some \(\epsilon >0\) independent of j.
The singular regions of the set \(\partial J^+(\Delta ^{(1)}_1 \cup \Delta ^{(2)})\) where the set ceases to be an embedded submanifold are reached by continuity of \(\Phi ^\psi _n\).
References
Beck, C.: Localization Local Quantum Measurement and Relativity. Dissertation an der Fakultät für Mathematik, Informatik und Statistik der Ludwig-Maximilian-Universität München (2020)
Barat, N., Kimball, J.C.: Localization and Causality for a free particle. Phys. Lett. A 308, 110 (2003)
Bialynicki-Birula, I., Bialynicka-Birula, Z.: Heisenberg uncertainty relations for photons. Phys. Rev. A 86, 022118 (2012)
Bostelmann, H., Fewster, C.J., Ruep, M.H.: Impossible measurements require impossible apparatus. Phys. Rev. D 103, 025017 (2021)
Busch, P.: Unsharp localization and causality in relativistic quantum theory. J. Phys. A: Math. Gen. 32(37), 6535 (1999)
Busch, P., Lahti, P., Pellonpää, J.-P., Ylinen, K.: Quantum Measurement. Springer, Berlin (2016)
Carmeli, C., Cassinelli, G., De Vito, E., Toigo, A., Vacchini, B.: A complete characterization of phase space measurements. J. Phys. A: Math. Gen. 37, 5057–5066 (2004)
Castrigiano, D.P.L.: Dirac and Weyl Fermions—The Only Causal Systems (2017). arXiv:1711.06556
Castrigiano, D.P.L., Leiseifer, A.D.: Causal localizations in relativistic quantum mechanics. J. Math. Phys. 56, 072301 (2015)
Cohn, D.: Measure Theory. Birkhäuser, Basel (1980)
Drago, N., Moretti, V.: The notion of observable and the moment problem for \(*\)-algebras and their GNS representations. Lett. Math. Phys. 110(7), 1711–1758 (2020)
Evans, L.C., Gariepy, R.F.: Measure Theory and Fine Properties of Functions, revised edn. CRC Press, Boca Raton (2015)
Fewster, C.J., Verch, R.: Quantum fields and local measurements. Commun. Math. Phys. 378, 851–889 (2020)
Fewster, C.J., Jubb, I., Ruep, M.H.: Asymptotic measurement schemes for every observable of a quantum field theory. Ann. Henri Poincaré 24, 4 (2022)
Farkas, S., Kurucz, Z., Weiner, M.: Poincaré covariance of relativistic quantum position. Int. J. Theor. Phys. 41, 1 (2002)
Friedlander, F.G.: The Wave Equation on a Curved Space-Time. Cambridge University Press, Cambridge (1976)
Gerlach, B., Gromes, D., Petzold, J.: Konstruktion definiter Ausdrücke für die Teilchendichte des Klein–Gordon–Feldes. Z. Phys. 204(1), 1–11 (1967)
Gerlach, B., Gromes, D., Petzold, J., Rosenthal, P.: Über kausales Verhalten nichtlokaler Grössen und Teilchenstruktur in der Feldtheorie. Z. Phys. 208, 381–389 (1968)
Halvorson, H., Clifton, R.: No place for particles in relativistic quantum theories? In: Kuhlmann, M., Lyre, H., Wayne, A. (eds.) Ontological Aspects of Quantum Field Theory. World Scientific, Singapore (2002)
Hegerfeldt, G.C.: Remark on causality and particle localization. Phys. Rev. D 10, 3320 (1974)
Hegerfeldt, G.C.: Violation of causality in relativistic quantum theory? Phys. Rev. Lett. 54, 2395 (1985)
Hellwig, K.E., Kraus, K.: Formal description of measurements in local quantum field theory. Phys. Rev. D 1, 566 (1970)
Henning, J.J., Wolf, W.: Positive definite densities for the positive frequency solutions of the Klein–Gordon equation with arbitrary mass. Z. Phys. 242, 12–20 (1971)
Hörmander, L.: A remark on the characteristic Cauchy problem. J. Funct. Anal. 93, 270–277 (1990)
Jancewicz, B.: Operator density current and relativistic localization problem. J. Math. Phys. 18, 2487 (1977)
Kazemi, M.J., Hashamipour, H., Barati, M.H.: Probability density of relativistic spinless particles. Phys. Rev. A 98, 012125 (2018)
Malament, D.B.: In defense of dogma: why there cannot be a relativistic quantum mechanics of (localizable) particles. In: Clifton, R. (ed.) Perspectives on Quantum Reality. Kluwer Academic Publishers, Amsterdam (1996)
Moretti, V.: Spectral Theory and Quantum Mechanics, 2nd revised and enlarged edition. Springer, Berlin (2017)
Moretti, V.: Fundamental Mathematical Structures of Quantum Theory. Springer, Berlin (2019)
Murata, M.: Anti-locality of certain functions of the Laplace operator. J. Math. Soc. Jpn. 25(4), 556–564 (1973)
Newton, T.D., Wigner, E.P.: Localized states for elementary systems. Rev. Mod. Phys. 21, 400–406 (1949)
Ozawa, M.: Quantum measuring processes of continuous observables. J. Math. Phys. 25, 79 (1984)
Rudin, W.: Real and Complex Analysis, 3rd edn. McGraw-Hill, New York (1986)
Ruijsenaars, S.N.M.: On Newton–Wigner localization and superluminal propagation speeds. Ann. Phys. 137, 33–43 (1981)
Segal, E., Goodman, R.W.: Anti-locality of certain Lorentz-invariant operators. J. Math. Mech. 14(4), 629–638 (1965)
Terno, D.R.: Localization of relativistic particles and uncertainty relations. Phys. Rev. A 89, 042111 (2014)
Weidlich, W., Mitra, A.K.: Some remarks on the position operator in irreducible representations of the Lorentz-group. Nuovo Cim. 30, 385–389 (1963)
Wightman, A.S.: On the localizability of quantum mechanical systems. Rev. Mod. Phys. 34, 845–872 (1962)
Acknowledgements
I am very grateful to D.P.L. Castrigiano for various remarks, suggestions, and discussions about several issues appearing in this paper. I thank S. Delladio, N. Drago, C. Fewster, F. Finster, S. Mazzucchi, P. Meda, and M. Sánchez for helpful discussions. I am finally grateful to a referee for very helpful comments and suggestions of various nature, including further relevant references. This work has been written within the activities of INdAM-GNFM.
Funding
Open access funding provided by Università degli Studi di Trento within the CRUI-CARE Agreement.
Appendix A: Proof of some propositions
Proof of Proposition 3
The first two statements are evident per direct inspection. The density property arises from the fact that the Schwartz space \({\mathscr {S}}({{\mathbb {R}}}^3)\) is dense in \(L^2({{\mathbb {R}}}^3, d^3p)\). Therefore, if \(\psi \in {{{\mathcal {H}}}}\), there is a sequence \({\mathscr {S}}({{\mathbb {R}}}^3) \ni \psi _n\) with
However, \(\psi '_n:=\sqrt{ E_n} \psi _n \in {\mathscr {S}}({{\mathbb {R}}}^3)\) as well, and \(\int _{{{\mathbb {R}}}^3} \left| \psi (\vec {p}_n) - \psi '_n(\vec {p}_n)\right| ^2 \frac{d^3p}{E_n(\vec {p}_n)} \rightarrow 0\). The sequence of \(\psi '_n\) belongs to \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) by definition and converges to \(\psi \) in the topology of \({{{\mathcal {H}}}}\) so that the thesis is true. \(\square \)
Proof of Proposition 4
The dense subspace \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) stays in the domains of the considered operators, it is invariant and thereon the operators are symmetric. The multiplicative action of the one-parameter groups generated by the said four operators leaves \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) invariant, as it arises per direct inspection. As a consequence of a known corollary of the Stone theorem (see, e.g., Corollary 7.26 in [29]), the thesis follows. \(\square \)
Proof of Proposition 13
First observe that \( N^0_{n,t}\) is nothing but tI so that (1) and (2) are trivial for it. Assuming \(t=0\), let us focus again on the unitary map (19)
By direct inspection one sees that \(P'_{n k}:=S_n P_{n k} S_n^{-1} \) is still a multiplicative operator \(\vec {p}_{nk} \cdot \) (for \(k = 1,2,3\)) in \( L^2({{\mathbb {R}}}^3, d^3p)\). Similarly, from (20), \({ N'}_{n,0}^{k}:=S_n N^k_{n,0} S_n^{-1}\) is the (self-adjoint) multiplicative operator \(x^k\cdot \) in \(L^2({{\mathbb {R}}}^3,d^3x)\), where \(L^2({{\mathbb {R}}}^3,d^3x)\) and \(L^2({{\mathbb {R}}}^3, d^3p)\) are connected to each other by the Fourier-Plancherel unitary transform. Hence, these sets of operators are exactly the non-relativistic ones in \(L^2({{\mathbb {R}}}^3,d^3p)\) and \(L^2({{\mathbb {R}}}^3, d^3x)\). As a consequence, (1), (2), and (3) are valid because they are valid for the non-relativistic operators once \(\mathcal{S}({{{\mathcal {H}}}})\) is replaced by \({\mathscr {S}}({{\mathbb {R}}}^3) = S_n({{{\mathcal {S}}}}({{{\mathcal {H}}}}))\) (e.g., see [29]) and the considered properties are invariant under unitary maps. If we switch on \(t\ne 0\), since \( N^k_{n,t} = U^{(n)-1}_t N^k_{n,0} U^{(n)}_t\) and \(P_{n \alpha } = U^{(n)-1}_t P_{n \alpha } U^{(n)}_t\) as a consequence of the analogs for the corresponding spectral measures, the found properties are still valid because the evolutor \(U_t^{(n)}\) is unitary and leaves \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) invariant. Let us pass to the proof of (5). From (18), \(D( N_{n,t}^\alpha ) \supset {{{\mathcal {S}}}}(\mathcal{H})\), and the definition (25), we have
where \(\psi '\in {{{\mathcal {H}}}}\) and \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). The last integral equals
which implies the thesis due to the arbitrariness of \(\psi ' \in \mathcal{H}\). Only (4), i.e., the pair of identities in (46), remains to be proved for \(\alpha =k=1,2,3\). The first identity \(U^{(n)\dagger }_t N_{n,0}^kU^{(n)}_t\psi = N_{n,t}^k\psi \) for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) immediately arises from (30). Let us pass to the second identity in (46). Define \(f(\vec {p}_n):= (S_n\psi )(\vec {p}_n)\) where \(S_n\) is the unitary map (19). The operators \(P_{nk}\) and \(H_n\) act on the functions \(f=f(\vec {p}_n)\) multiplicatively, respectively, with \(p_k\) and \(\sqrt{\vec {p}^2+m^2}\), whereas \( N^k_{n,0}\) is represented by \(i\frac{\partial }{\partial p_k}\); finally, \(U^{(n)}_t\) is the multiplicative operator with \(e^{-it \sqrt{\vec {p}^2+m^2}}\). As a consequence, for \(\psi ,\psi ' \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) (writing \(\vec {p}\) in place of \(\vec {p}_n\))
where \(f= S_n(\psi ) \in {\mathscr {S}}({{\mathbb {R}}}^3)\) and \(f'= S_n(\psi ') \in {\mathscr {S}}({{\mathbb {R}}}^3)\). Using the fact that f and \(f'\) are Schwartz, the t-derivative of the integral above can be computed by passing the derivative under the integral sign (by a straightforward use of Lebesgue’s dominated convergence theorem), finding
As the final result does not depend on time, we can argue that
Namely,
Since \(\psi '\in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) which is dense, the found result implies the thesis. \(\square \)
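The key step of the computation can be sketched directly in the momentum representation. This is a reconstruction from the operator actions stated in the proof (\(i\frac{\partial }{\partial p_k}\) for \(N^k_{n,0}\), multiplication by \(e^{-itE}\) for \(U^{(n)}_t\)), not a quotation of the displayed formulas; the shorthand \(E(\vec {p}):=\sqrt{\vec {p}^2+m^2}\) is introduced here only for brevity, and the index conventions are assumed to be those of the proof:

\[
e^{itE(\vec {p})}\, i\frac{\partial }{\partial p_k}\left( e^{-itE(\vec {p})} f(\vec {p})\right) = i\frac{\partial f}{\partial p_k}(\vec {p}) + t\, \frac{\partial E}{\partial p_k}(\vec {p})\, f(\vec {p}) = i\frac{\partial f}{\partial p_k}(\vec {p}) + t\, \frac{p_k}{E(\vec {p})}\, f(\vec {p}) .
\]

Read back in \({{{\mathcal {H}}}}\), this suggests \(N^k_{n,t}\psi = N^k_{n,0}\psi + t H_n^{-1}P_{nk}\psi \) for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\): the t-derivative is \(H_n^{-1}P_{nk}\psi \), which does not depend on time, in agreement with the argument above.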
Proof of Corollary 14
We shall write \(P_k\) in place of \(P_{nk}\) and H in place of \(H_n\) for brevity. As \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), which is invariant under \(P_k\) and H, no domain issues arise in the following. Due to (29), the thesis is equivalent to
To prove it, observe that \(H^{-1} P_k\) is well defined and symmetric on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\), hence
so that, since \((H^{-1} P_k) (H^{-1}P_k)\psi = H^{-2}P^2_k\psi \) for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\),
As a consequence,
Since \(m^2\langle \psi |H^{-2} \psi \rangle = m^2|| H^{-1}\psi ||^2> 0\) (\(H^{-1}\psi =0\) is not possible if \(\psi \ne 0\) because, as \(H^{-1}: {{{\mathcal {H}}}}= Ran(H) \rightarrow D(H)\), it would imply \(0=HH^{-1}\psi = \psi \)), the inequality above implies the thesis. \(\square \)
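Since the displayed formulas of this proof are not reproduced above, the following chain is offered only as a hedged reconstruction of the argument. It assumes that \(||\psi ||=1\) and that the inequality to be proved reads \(\sum _{k=1}^3 \langle \psi |H^{-1}P_k\psi \rangle ^2<1\); both assumptions are inferred from the surrounding text rather than quoted from it.

\[
\sum _{k=1}^3 \langle \psi |H^{-1}P_k\psi \rangle ^2 \le \sum _{k=1}^3 ||H^{-1}P_k\psi ||^2 = \sum _{k=1}^3 \langle \psi |H^{-2}P_k^2\psi \rangle = \langle \psi |H^{-2}(H^2-m^2I)\psi \rangle = 1 - m^2\langle \psi |H^{-2}\psi \rangle < 1 .
\]

The first step is the Cauchy–Schwarz inequality, the second uses the symmetry of \(H^{-1}P_k\) on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the third uses \(\sum _k P_k^2 = H^2 - m^2I\) on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\), and the final strict inequality is the positivity of \(m^2||H^{-1}\psi ||^2\) noted above.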
Proof of Eq. (40)
From (36) and the definition of \(\psi _j\),
up to a non-vanishing multiplicative constant, coincides with
where \({\hat{\chi }}\) is a Schwartz function on \({{\mathbb {R}}}^3\) and f is the Fourier transform (up to a constant factor) of the characteristic function of \(B_R\),
Since
we have
Using the fact that the last factor, \(\cos u\), and \(u^{-1} \sin u\) are bounded, we have that, for some \(C\ge 0\),
where we have used Hölder’s inequality in the last passage. As a matter of fact, since the Lebesgue measure is translationally invariant, there is \(K\ge 0\) such that, uniformly in j,
The integrand is j-uniformly bounded by the integrable function \(\frac{K'}{E_n(p)^4}\) for some constant \(K'\ge 0\) and the integrand vanishes pointwise as \(j\rightarrow +\infty \) as \({\hat{\chi }} \in {\mathscr {S}}({{\mathbb {R}}}^3)\). Lebesgue’s dominated convergence theorem implies that \(I_j \rightarrow 0\) as \(j\rightarrow +\infty \). \(\square \)
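The Fourier transform of the characteristic function of \(B_R\) used in this proof has a standard closed form. With the normalization \(f(\vec {p}) = \int _{B_R} e^{-i\vec {p}\cdot \vec {x}} d^3x\) (an assumption, since the text fixes f only up to a constant factor) and \(u := R|\vec {p}|\), one finds \(f = 4\pi R^3 u^{-3}(\sin u - u\cos u)\), which is where the bounded functions \(\cos u\) and \(u^{-1}\sin u\) come from. The sketch below is a numerical sanity check of that closed form against a direct radial quadrature; the function names are hypothetical and chosen only for this illustration.

```python
import math


def ball_ft_closed_form(p, R):
    """Closed form of the integral of exp(-i p.x) over the ball of radius R,
    for a momentum of modulus p > 0 (normalization assumed, see lead-in)."""
    u = p * R
    return 4.0 * math.pi * (math.sin(u) - u * math.cos(u)) / p**3


def ball_ft_quadrature(p, R, steps=2000):
    """Same quantity from the radial integral 4*pi * int_0^R r*sin(p*r)/p dr,
    obtained after the angular integration, via composite Simpson's rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = R / steps

    def g(r):
        return r * math.sin(p * r) / p

    total = g(0.0) + g(R)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return 4.0 * math.pi * total * h / 3.0


if __name__ == "__main__":
    # Compare the two evaluations for a few momenta at fixed radius R = 2.
    for p in (0.5, 1.0, 3.0, 10.0):
        print(p, ball_ft_closed_form(p, 2.0), ball_ft_quadrature(p, 2.0))
```

For small \(|\vec {p}|\) the closed form tends to the volume \(4\pi R^3/3\) of the ball, as it should for the Fourier transform of a characteristic function evaluated near zero momentum.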
Proof of Proposition 33
We start where the proof of Lemma 31 ends, with the further hypothesis that \(\psi \in {{{\mathcal {D}}}}({{{\mathcal {H}}}})\). We first consider the case of \(\Delta _1\) made of a single ball. Since \(-J^v\ge 0\) is continuous, the integral in (57) vanishes if and only if \(J^v=0\) everywhere on L. This is the only possibility for having \(\langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle = \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle \). Let us prove that \(J^v=0\) everywhere in L is not permitted and this fact will conclude the proof. Let us assume that \(J^v=0\) on L so that \(J_n^\psi \) vanishes or is lightlike on L because \(-J^uJ^v + h(\vec {J}, \vec {J}) \le 0\) and the only remaining component is \(J^u\). From Proposition 30 we know that \(\Phi ^\psi _n(x)=0\) if \(x\in L\). Making explicit the form of \(\Phi ^\psi _n\) on L, in terms of our coordinate system, we have that
where
and \(\theta _p,\phi _p\) are the polar angles of \(\vec {p}_n\). Passing to lightlike coordinates and noticing that L is described by \(v=0\), we have, in particular, that it must be
where \(a<b\) are determined by \(t_2-t_1\) and the radius of \(\Delta _1\). Since \(\psi \) is continuous with compact support (here the condition \(\psi \in {{{\mathcal {D}}}}({{{\mathcal {H}}}})\) is used), by a standard argument based on the Cauchy-Riemann identities and the Lebesgue dominated convergence theorem it is easy to prove that the function in the right-hand side can be analytically extended to complex values of u in the whole complex plane. As this function vanishes in the real segment [a, b], it must vanish everywhere in \(u \in [0,+\infty )\).
We observe for future convenience that the same argument can be used to prove that the integral is an analytic function in the variables \(\theta \) and \(\phi \) and that if the function vanishes in an open interval in the domain of \(\theta \) or in an analogous open interval in the domain of \(\phi \), then it must vanish for all the permitted values of these variables, respectively, \(\theta \in [0,\pi ]\) and \(\phi \in [-\pi ,\pi ]\). To assert that \(\Phi ^\psi _n=0\) on the whole conical surface described by \(u\in [0,+\infty )\), \(\theta \in [0,\pi ]\), \(\phi \in [-\pi ,\pi ]\) it is therefore sufficient that \(\Phi ^\psi _n=0\) on an open set on that conical surface.
The conclusion is that the smooth solution \(\Phi ^\psi _n\) of the massive Klein–Gordon equation in \({{\mathbb {M}}}\) vanishes on the whole conical surface defined by prolonging \(\partial J^+(D_1)\) for times \(<t_1\) up to the tip of the cone. As is known [16, 24], the characteristic Cauchy problem (also known as the Goursat problem) is well-posed inside a Lorentzian cone and thus the only possible solution inside the volume of the cone is \(\Phi ^\psi _n=0\). In other words, our wavefunction, defined in the whole \({{\mathbb {M}}}\), must vanish in the volume of the cone. In particular, \(\Phi _n^\psi (t_1,\cdot )=0\) and \(i\partial _t \Phi _n^\psi (t_1,\cdot ) = (\overline{-\Delta +m^2I})^{1/2} \Phi _n^\psi (t_1,\cdot )=0\) in the open ball \(\Delta _1\). Theorem 11 implies that \(\Phi _n^\psi (t_1,\cdot )\) vanishes on the whole \(\Sigma _{n,t_1}\). Inverting (52), we have \(\psi =0\), which is not possible since \(||\psi ||=1\) by hypothesis. Hence the hypothesis \(J^v=0\) everywhere on L is untenable, and this fact removes the possibility of having \(=\) in (60), proving the thesis for the considered case.
Let us now consider the case of \(\Delta _1 = \Delta ^{(1)}_1 \cup \Delta ^{(2)}_1\) with the two sets being a pair of non-empty finite-radius open balls. We can always assume that neither ball includes the other, but they can have non-empty intersection. We have
where \(L_{12}\) is the part of \(\partial J^+(\Delta ^{(1)}_1 \cup \Delta ^{(2)}_1)\) which stays between the parallel planes \(\Sigma _{n,t_1}\) and \(\Sigma _{n,t_2}\). As before, the integral is non-negative because we can apply the previous argument to each portion of conical surface forming \(L_{12}\) and, respectively, generated by \(\Delta _1^{(1)}\) and \(\Delta _1^{(2)}\), taking advantage of two different polar coordinate systems. However, the fact that the integral is strictly positive needs a little more care. As before, on account of Proposition 30, the value of the integral is zero if and only if \(\Phi ^\psi _n\) vanishes everywhere on \(L_{12}\). We can focus attention on the complete conical surface \(\Gamma _1\) which completes \(\partial J^+(\Delta ^{(1)}_{1})\) in its past till the tip, centering a system of polar coordinates on its center. It is clear that the intersection \(\Gamma _1 \cap L_{12}\) includes an open set (in the relative topology of \(\Gamma _1\)) where \(\Phi ^\psi _n\) vanishes because it vanishes on the whole \(L_{12}\). Using the analyticity argument exploited above, we conclude that \(\Phi ^\psi _n\) vanishes on the whole \(\Gamma _1\), so that it also vanishes in the interior of the cone in view of the characteristic Cauchy problem as before, and finally \(\Phi ^\psi _n=0\) everywhere in \({{\mathbb {M}}}\) due to Theorem 11, reaching the contradiction \(\psi =0\). Hence the right-hand side of (70) is strictly positive and the proof for the examined case is over.
To conclude the proof, it is sufficient to observe what follows in the case \(\Delta _1\) is a finite union of distinct finite-radius open balls \(\Delta ^{(j)}_{1}\), \(j=1,\ldots , N\). We can always assume that no ball of the family is a subset of another ball of the family. Since N is finite, the region of \(\partial J^+(\Delta _1)\) between \(t_1\) and \(t_2\) necessarily includes an open portion of some \(\partial J^+(\Delta ^{(j)}_1)\). Working in the conical completion \(\Gamma _j\) of \(\partial J^+(\Delta ^{(j)}_1)\), we can use the above argument achieving the thesis.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Moretti, V. On the relativistic spatial localization for massive real scalar Klein–Gordon quantum particles. Lett Math Phys 113, 66 (2023). https://doi.org/10.1007/s11005-023-01689-5
DOI: https://doi.org/10.1007/s11005-023-01689-5
Keywords
- Relativistic quantum localization
- Newton–Wigner localization
- Hegerfeldt theorem
- Positive-operator-valued measures
- Klein–Gordon equation
- Stress-energy tensor