1 Introduction

A long-standing puzzle of theoretical and mathematical physics concerns the notion of spatial localization of a relativistic particle at a given time. The problem is difficult because of a number of no-go results that have emerged over the years, after the seminal work of Newton and Wigner [31]. These results establish that apparently natural proposals to define a spatial observable of a relativistic free particle (for a given Minkowski reference frame at a certain time) are actually forbidden by general requirements concerning causal locality and positivity of the energy. The first victim of these no-go results is the Newton–Wigner notion of localization itself.

My opinion is that this issue has been quite overlooked in spite of being urgent: after all, experimental physicists can assert, with a certain approximation, where a relativistic particle has been detected at a certain time in laboratories. What theoretical notion describes these kinds of claims by our colleagues?

The notion of position observable is not only perfectly defined in the non-relativistic regime, but it also plays a central role in the theoretical construction of quantum theory itself. The notion of position is involved in the first version of the canonical commutation relations and in the theoretical explanation of the Heisenberg principle. How is it possible that so crucial a theoretical notion simply fades out when we pass to the relativistic regime?

The situation is very delicate from the physical perspective. First of all, we know that trying to localize a particle below its Compton length gives rise to pairs of particles, so that a sharp localization does not seem possible. In this sense the detectors should play an active role [1]. However, that is a physical fact predicted by interacting QFT. It is not clear how such an obstruction should arise in an elementary (perhaps naive) mathematical description that disregards the effects of Quantum Field Theory.

In the author’s view, however, the intricate nature of the problem is also due to a frequent confusion in the literature concerning two entangled, but actually physically distinct, issues.

  1. (I1)

    On the one hand, one can focus on the properties and theoretical assumptions on the probability of spatial localization, without paying attention to the post-measurement state. In that case, the major obstructions against apparently natural definitions of localization observables arise from a class of theoretical results cumulatively called Hegerfeldt’s theorem [20, 21] and their more advanced re-formulations [8, 9]. They at least prove that no sharp localization is possible if the generator of time evolution is bounded below. Sharp localization would imply non-local features of the (time evolution of the) position probability distributions: a superluminal spread of the probability distribution [34]. These no-go results concern any general description of the spatial localization observable (at a given time) in terms of positive-operator valued measures (POVMs) and not only projection-valued measures (PVMs). Castrigiano [8] formulated a precise causality condition ((b) in Definition 15) that every physically acceptable POVM (or PVM)—which describes spatial localization—should satisfy independently of the issue of the post-measurement state.

  2. (I2)

    On the other hand, one can (also) focus on the post-measurement state arising after a position measurement. In that case, a list of no-go results has been accumulated over the years starting from the so-called Malament theorem [27]. It establishes, in particular, that localization cannot be described in terms of PVMs, i.e., not even in terms of self-adjoint operators. This happens when (a) the post-measurement state is produced by a projective measurement, (b) the PVM satisfies natural requirements of locality (according to Hellwig–Kraus’ analysis [22]), and (c) the generator of time evolution is positive or bounded below. By strengthening the hypotheses of Malament’s statement, the no-go result can be extended to localization observables in terms of POVMs, as established first by Busch [5] and later by Halvorson and Clifton [19], when a suitable post-measurement procedure has been chosen (essentially an ideal Lüders measurement).

However, there is no automatic way to pass from (I1) to (I2), especially when the position observable is described in terms of a POVM. There are infinitely many measurement schemes (based on completely positive maps) which give rise to the same PVM or POVM while the post-measurement states are completely different. This arbitrariness was already noticed by von Neumann in his seminal book on the mathematical foundations of Quantum Mechanics and it is a fundamental tool in the modern theory of quantum measurement [6]. The fact that the values of a position observable are continuous is a further source of problems. Continuity of outcomes rules out all naive state-updating procedures on account of the crucial Ozawa theorem [32]. The standard projective Lüders scheme is physically untenable in this case, even if it is always described as the prototype of all state updating processes in many textbooks of quantum mechanics.

Referring to (I2), it seems to me that these no-go results against every notion of spatial localization observable always rely on a precise choice of the description of the post-measurement state in terms of Kraus operators (e.g., they are the square root of the effects of the POVM). In my opinion, this choice appears to oscillate between being too naive or too arbitrary. Therefore, some apparently definite claims, relying upon the issue (I2), about the non-existence of any spatial localization observable [19] do not seem really motivated up to now. Even if they impose some severe constraint on the measurement scheme, the last word has not been said in my view.

Both issues (I1) and (I2) rule out, in particular, the already cited Newton–Wigner position observable [31] of a quantum relativistic particle.

The Newton–Wigner position observable is described in terms of a family of PVMs \({{\textsf{Q}}}_{n,t}={{\textsf{Q}}}_{n,t}(\Delta )\), where \(\Delta \) ranges over the measurable sets of the rest 3-space \(\Sigma _{n,t}\) of every given Minkowski reference frame n at every given time t. This family of PVMs is covariant with respect to the Poincaré group. It is worth stressing that covariance with respect to the spatial Euclidean subgroup (together with some further technical hypotheses) uniquely determines the family of \({{\textsf{Q}}}_{n,t}\) as a consequence of Mackey’s imprimitivity theory, as proved by Wightman [38]. This is one of the theoretical motivations which make the NW position observable quite appealing.

In view of the spectral theorem, the information of the family of PVMs \({{\textsf{Q}}}_{n,t}\) is completely encapsulated in the assignment of a set of self-adjoint operators, the Newton–Wigner position operators

$$\begin{aligned} N_{n,t}^\alpha := \int _{\Sigma _{n,t}} x^\alpha \textrm{d}{{\textsf{Q}}}_{n,t}(x)\quad \alpha = 0,1,2,3\,, \end{aligned}$$

where \(x^0=t\) and the Minkowski coordinates (self-adjoint operators) \(N_{n,t}^1,N_{n,t}^2,N_{n,t}^3\) of a particle are co-moving with n. Obviously \(N^0_{n,t}=tI\).

To make the issue more intricate, the Newton–Wigner position self-adjoint operators \(N^\mu _{n,t}\) possess quite natural and appealing properties in spite of the fact that the associated PVM violates basic local-causality principles. In particular (see Sect. 3), explicitly referring to the case of a scalar massive particle:

  1. (i)

    natural covariance properties with respect to the Lorentz (and Poincaré) group take place (on a suitable domain):

    $$\begin{aligned} U_\Lambda N_{n,t}^\alpha U_\Lambda ^{-1} = (\Lambda ^{-1})^\alpha _\beta N^{\beta }_{\Lambda n, t_{\Lambda }}\,, \quad \forall \Lambda \in O(1,3)_+\,; \end{aligned}$$
  2. (ii)

    a quite natural relativistic version of Ehrenfest’s theorem is valid for \(k=1,2,3\):

    $$\begin{aligned} U^{(n)\dagger }_t N_{n,0}^k U^{(n)}_t = N_{n,t}^k = N_{n,0}^k + t\frac{P_{nk}}{P_{n0}}\,; \end{aligned}$$
  3. (iii)

    the worldline determined by the expectation values \(\langle \psi |N^\mu _{n,t} \psi \rangle \) is timelike, as expected for a massive particle:

    $$\begin{aligned} \sum _{k=1}^3 \left( \frac{\textrm{d}}{\textrm{d}t} \langle \psi | N^{k}_{n,t} \psi \rangle \right) ^2 < 1 \,; \end{aligned}$$
  4. (iv)

    Heisenberg’s commutation relations are satisfied on a suitable dense invariant domain (a core)

    $$\begin{aligned} {[}N_{n,t}^k, N_{n,t}^h]=[ P_{n h}, P_{n k}]=0\,, \qquad [ N_{n,t}^k, P_{n h}]= i\hbar \delta ^k_hI \,, \end{aligned}$$
  5. (v)

    this, in particular, produces the standard statement of the Heisenberg principle;

  6. (vi)

    when the energy content of a state vector \(\psi \) is small compared with the rest energy \(mc^2\) of the particle, \(N^k_{n,0}\psi \) tends to \(X^k\psi \), where \(X^k\) is the non-relativistic position operator.
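
Property (iii) can be traced back to (ii) by a short formal computation. The following is only a sketch: domain issues are disregarded, \(\psi \) is assumed to be a unit vector, and the notation \(\mu _m\), \({\textsf{V}}_{m,+}\), \(\vec {p}_n\) is the one of Sect. 2. By (ii), the time derivative of \(\langle \psi | N^{k}_{n,t} \psi \rangle \) is \(\langle \psi | (P_{nk}/P_{n0}) \psi \rangle \), so that the Cauchy–Schwarz inequality gives

$$\begin{aligned} \sum _{k=1}^3 \left( \frac{\textrm{d}}{\textrm{d}t} \langle \psi | N^{k}_{n,t} \psi \rangle \right) ^2&= \sum _{k=1}^3 \left\langle \psi \left| \frac{P_{nk}}{P_{n0}}\psi \right. \right\rangle ^2 \le \sum _{k=1}^3 \left\langle \psi \left| \frac{(P_{nk})^2}{(P_{n0})^2}\psi \right. \right\rangle \\&= \int _{{\textsf{V}}_{m,+}} \frac{\vec {p}_n^{\,2}}{\vec {p}_n^{\,2}+m^2}\, |\psi (p)|^2\, \textrm{d}\mu _m(p) < 1\,. \end{aligned}$$

The last inequality is strict because \(m>0\) makes the fraction strictly smaller than 1 everywhere, while \(|\psi |^2\) integrates to 1. Only squares appear, so the estimate is insensitive to the sign conventions adopted for the components of the four-momentum.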

This paper is devoted to addressing the issue (I1) for a scalar Klein–Gordon particle with mass \(m>0\). To this end, a recent proposal of a (non-commutative) POVM localization observable \({{\textsf{A}}}_{n,t}(\Delta )\) will be considered for massive spin-0 particles. This proposal is due to Terno [36]. This notion of localization, contrary to the Newton–Wigner one, does not admit sharply localized states (Proposition 25), so that it is not in automatic conflict with the Hegerfeldt theorem. However, it admits states which approximate localized states arbitrarily well (Proposition 25). A sketch of a proof that the spatial decay of the Terno probabilities does not trigger Hegerfeldt’s superluminal phenomena appears in [36]. We shall rigorously prove this fact as a byproduct of the achievement (B) below.

We shall show (Theorem 22) that the POVM \({{\textsf{A}}}_{n,t}(\Delta )\) is actually a kinematic deformation of the PVM \({{\textsf{Q}}}_{n,t}(\Delta )\) in terms of the components of the four-momentum \(P_n^\mu \) in the used Minkowski reference frame n:

$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Delta ) = {{\textsf{Q}}}_{n,t}(\Delta ) + \frac{1}{2}\left( \frac{P_{n\mu }}{P_{n0}} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{P_{n}^\mu }{P_{n0}} + \frac{m}{P_{n0}} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{m}{P_{n0}} \right) \,. \end{aligned}$$

This relation implies, in particular, that the family of POVMs \({{\textsf{A}}}_{n,t}\) satisfies a covariance property with respect to the Poincaré group analogous to the one satisfied by \({{\textsf{Q}}}_{n,t}\).
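
As a quick sanity check of the displayed relation between the two families, one can verify normalization and positivity directly. The sketch below assumes only the mass-shell relation \(P_{n\mu }P_n^\mu = -m^2 I\) (valid with the signature \(-,+,+,+\) adopted in Sect. 2), the identity \(P_n^0 = -P_{n0}\), \(P_n^k = P_{nk}\), and the normalization \({{\textsf{Q}}}_{n,t}(\Sigma _{n,t}) = I\) of the PVM. The operators \(P_{n\mu }/P_{n0}\) and \(m/P_{n0}\) are bounded functions of the four-momentum, so that no domain issue arises. Taking \(\Delta = \Sigma _{n,t}\),

$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Sigma _{n,t}) = I + \frac{1}{2}\left( \frac{P_{n\mu }P_{n}^\mu }{(P_{n0})^2} + \frac{m^2}{(P_{n0})^2} \right) = I + \frac{1}{2}\,\frac{-m^2+m^2}{(P_{n0})^2} = I\,, \end{aligned}$$

as it must be for a normalized POVM. Moreover, the term with \(\mu =0\) in the sum equals \(-{{\textsf{Q}}}_{n,t}(\Delta )\), so that the same relation can be rewritten as

$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Delta ) = \frac{1}{2}{{\textsf{Q}}}_{n,t}(\Delta ) + \frac{1}{2}\left( \sum _{k=1}^3\frac{P_{nk}}{P_{n0}} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{P_{nk}}{P_{n0}} + \frac{m}{P_{n0}} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{m}{P_{n0}} \right) \,. \end{aligned}$$

Every summand has the form \(B\,{{\textsf{Q}}}_{n,t}(\Delta )\,B\) with B bounded and self-adjoint, hence it is a positive operator because \({{\textsf{Q}}}_{n,t}(\Delta )\) is an orthogonal projector. Positivity of \({{\textsf{A}}}_{n,t}(\Delta )\) is therefore manifest in this form.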

Three main results are next achieved in this paper by expanding and making mathematically rigorous some definitions and results discussed in [36] and referring to some ideas introduced in [8].

  1. (A)

    Theorem 26 proves that, in spite of the difference of the two POVMs, the first-moment operator \(X^\alpha _{n,t}\) of Terno’s POVM coincides with the Newton–Wigner position operator. Therefore, \(X^\alpha _{n,t}\) preserves all good properties (i)–(vi) of that operator listed above but (v). In fact, a corrected version of the Heisenberg inequality will be established

    $$\begin{aligned} \Delta _\psi X^k_{n,t} \Delta _\psi P_{nk} \ge \frac{\hbar }{2} \sqrt{1 + 2\Delta _\psi P_{n,k}^2 \left\langle \psi \left| \frac{(P_{n0})^2-(P_{nk})^2}{(P_{n0})^{4}}\psi \right. \right\rangle }\,. \end{aligned}$$

    It evidently reproduces the standard inequality for large values of the mass.

  2. (B)

    Theorem 35 proves that Terno’s notion of spatial localization satisfies a consequence of the causality requirement introduced by Castrigiano [8], as conjectured by Terno [36]. The validity of this condition rules out, in particular, the obstruction represented by Hegerfeldt’s theorem. This pair of achievements promotes \({{\textsf{A}}}_{n,t}(\Delta )\) to a very good candidate for the relativistic notion of spatial localization of a massive scalar particle, at least from the viewpoint of the issue (I1).

  3. (C)

    The validity of the complete Castrigiano causality requirement is finally established (Theorem 39). However, this result needs an improved version of the family of POVMs \({{\textsf{A}}}\) and a delicate discussion about the physical nature of spatial localization.
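
The large-mass statement in (A) can be made quantitative with an elementary estimate. The sketch below assumes that \(\Delta _\psi P_{n,k}^2\) in the corrected inequality denotes the variance \((\Delta _\psi P_{nk})^2\), that \(c=1\), and that \(\psi \) is a unit vector in a suitable domain. Since \((P_{n0})^2-(P_{nk})^2 = m^2 + \sum _{j\ne k}(P_{nj})^2 \ge 0\) and \((P_{n0})^2 \ge m^2\),

$$\begin{aligned} 0 \le \left\langle \psi \left| \frac{(P_{n0})^2-(P_{nk})^2}{(P_{n0})^{4}}\psi \right. \right\rangle \le \left\langle \psi \left| \frac{1}{(P_{n0})^{2}}\psi \right. \right\rangle \le \frac{1}{m^2}\,, \end{aligned}$$

and therefore

$$\begin{aligned} 1 \le \sqrt{1 + 2(\Delta _\psi P_{nk})^2 \left\langle \psi \left| \frac{(P_{n0})^2-(P_{nk})^2}{(P_{n0})^{4}}\psi \right. \right\rangle } \le \sqrt{1 + \frac{2(\Delta _\psi P_{nk})^2}{m^2}}\,. \end{aligned}$$

The correction factor thus approaches 1, and the standard bound \(\hbar /2\) is recovered, whenever the momentum spread of the state is small compared with m.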

In recent years, several interesting problems related to the issue (I2) and local causality have been fruitfully addressed in the setting of algebraic quantum field theory by Fewster, Verch and collaborators [4, 13, 14] in a given curved (globally hyperbolic) spacetime. These papers complete and largely extend the fundamental analysis by Hellwig and Kraus [22]. In that case, the relevant notion of localization refers to spacetime regions and to generic local observables in the Haag–Kastler setting. This paper instead deals with single particles (not quantum fields) and the localization refers to the rest space of a reference frame at a given time in Minkowski spacetime. It is clear that this is an idealized description which may eventually turn out to be unphysical, since realistic measurements necessarily take a finite lapse of time. However, up to now, this type of idealization does not seem to be a source of the above-mentioned obstructions to the definition of a physically meaningful notion of spatial localization. On the other hand, it seems remarkable that the Terno notion of spatial localization is actually a byproduct of QFT, at least from a heuristic perspective: it arises from the normally-ordered stress-energy tensor operator, which is intrinsically part of the basic constructions of QFT.

This paper is organized as follows. Section 2 contains a quick technical recap on the massive Klein–Gordon field in Minkowski spacetime, stressing, in particular, the covariance properties with respect to the relevant Poincaré unitary representation. Section 3 introduces the Newton–Wigner notion of spatial localization according to Wightman’s viewpoint. Section 4 illustrates some well-known problems with the NW notion of localization, also presenting Castrigiano’s general causality requirement and the notion of causal time evolution, and proving that the NW notion of localization is ruled out by the Hegerfeldt theorem. Section 5 recasts the notion of spatial localization presented by Terno in a rigorous setting and establishes some of its important properties. Section 6 proves that this notion of spatial localization is in agreement with Castrigiano’s notion of causal time evolution. Section 7 focuses on the causality condition proposed by Castrigiano by introducing a second family of POVMs depending on a pair of reference frames. The final section is devoted to a discussion of the achieved results and possible developments.

2 Minkowski spacetime and Klein–Gordon massive particles

2.1 Minkowski spacetime

In the rest of the paper, the Minkowski spacetime \({{\mathbb {M}}}\) is described as a four-dimensional real affine space (whose vector space of translations is denoted by \({\textsf{V}}\)) endowed with a Lorentzian metric g on \({\textsf{V}}\) with signature \(-,+,+,+\). A basis \(\{v_0,v_1, v_2,v_3\}\subset {\textsf{V}}\) is said to be pseudo-orthonormal if \(g(v_\mu ,v_\nu ) = \eta _{\mu \nu }\), where \([\eta _{\mu \nu }]= \textrm{diag}(-1,1,1,1)\).

Causal vectors \(v\in {\textsf{V}}\) satisfy by definition \(g(v,v) \le 0\) and \(v\ne 0\). Causal vectors with \(g(v,v)=0\) are said to be null or lightlike. They are timelike if \(g(v,v) <0\). Finally, spacelike vectors satisfy \(g(v,v)>0\).

\(({{\mathbb {M}}},g)\) is time-oriented, i.e., we choose a preferred half \({\textsf{V}}_+\) of the open cone of the timelike vectors, \(g(v,v)<0\). The (causal!) vectors in \(\overline{{\textsf{V}}_+}\setminus \{0\}\) are said to be future-directed. \({{\textsf{T}}}_+:=\{ v\in {\textsf{V}}_+ \,|\, g(v,v) = -1\}\) is the set of unit future-directed timelike vectors. The remaining half of the open cone of timelike vectors is denoted by \({\textsf{V}}_-\). The past-directed causal vectors are the elements of \(\overline{{\textsf{V}}_-}\setminus \{0\}\). The past-directed timelike and lightlike vectors are analogously the elements of \({\textsf{V}}_-\) and \(\partial {\textsf{V}}_-\setminus \{0\}\), respectively.

\(J^+(S) \subset {{\mathbb {M}}}\) denotes the causal future of \(S\subset {{\mathbb {M}}}\). It is the set of events \(e\in {{\mathbb {M}}}\) such that there is some \(e'\in S\) such that \(e-e' \in \overline{{\textsf{V}}_+}\). An analogous definition is valid for the causal past \(J^-(S)\) of S. Notice that \(S\subset J^{\pm }(S)\), \(A\subset B\) implies \(J^\pm (A) \subset J^\pm (B)\), and \(J^\pm \left( \bigcup _{\alpha \in A}S_\alpha \right) = \bigcup _{\alpha \in A}J^\pm (S_\alpha )\).

Remark 1

Throughout, \(v\cdot u:= g(v,u)\) when \(u,v\in {\textsf{V}}\). The speed of light is \(c=1\) and the Planck constant satisfies \(\hbar = 1\) unless otherwise specified. \(\blacksquare \)

2.2 Poincaré group, reference frames, and all that

I adopt the conventions of [15] regarding the interpretation of the relevant groups of transformations in \({{\mathbb {M}}}\). The orthochronous Lorentz group \(O(1,3)_+\) is the group of linear maps \(\Lambda : {\textsf{V}}\rightarrow {\textsf{V}}\) which both preserve the metric g and \({\textsf{V}}_+\). The orthochronous Poincaré group \(IO(1,3)_+\) is the group of affine maps \({{\mathbb {M}}}\rightarrow {{\mathbb {M}}}\) whose associated linear map belongs to \(O(1,3)_+\).

If \(A\subset {{\mathbb {M}}}\) and \(h\in IO(1,3)_+\), then \(hA:= \{h(e)\,|\, e \in A\}.\)

Every \(n\in {{\textsf{T}}}_+\) defines a corresponding (Minkowskian) reference frame in \({{\mathbb {M}}}\). The three-dimensional rest spaces of the reference frame n are the three-planes (pseudo-)orthogonal to n. To label them, one chooses a preferred point \(o \in {{\mathbb {M}}}\) called origin. (Nothing discussed in this paper depends on this choice.) A rest space of \(n \in {{\textsf{T}}}_+\) is therefore denoted by \(\Sigma _{n,t}\), where \(t\in {{\mathbb {R}}}\) indicates the signed distance (the proper time of n) of \(\Sigma _{n,t}\) from o:

$$\begin{aligned} \Sigma _{n,t}:= \{e\in {{\mathbb {M}}}\,|\, -(e-o)\cdot n = t\}\,. \end{aligned}$$
(1)

With a choice of the origin \(o\in {{\mathbb {M}}}\), the orthochronous Poincaré group \(IO(1,3)_+\) is isomorphic to the semidirect product of \(O(1,3)_+\) and \({\textsf{V}}\) itself and acts as follows

$$\begin{aligned} (\Lambda , a): {{\mathbb {M}}}\ni e \mapsto o+ a+ \Lambda (e-o) \in {{\mathbb {M}}}\quad \text{ for } (\Lambda ,a) \in O(1,3)_+\times {\textsf{V}}\,. \end{aligned}$$
(2)

By construction, if \(h:=(\Lambda _h,a_h) \in IO(1,3)_+\),

$$\begin{aligned} h \Sigma _{n,t} = \Sigma _{\Lambda _h n, t_{h}}\quad \text{ where }~t_{h}:= -(he -o) \cdot \Lambda _h n~\text{ for } \text{ every }\, e\in \Sigma _{n,t}. \end{aligned}$$
(3)

Notice that \(t_{h} = t- a_h \cdot \Lambda _h n\) does not depend on the choice of \(e\in \Sigma _{n,t}\).
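
The explicit value of \(t_h\) follows from (1)–(3) in one line; the computation is spelled out here as a sketch that uses only the fact that \(\Lambda _h\) preserves g. If \(e\in \Sigma _{n,t}\) and \(h=(\Lambda _h,a_h)\), then \(he-o = a_h + \Lambda _h(e-o)\) by (2), so that

$$\begin{aligned} t_h = -(he-o)\cdot \Lambda _h n = -a_h\cdot \Lambda _h n - \Lambda _h(e-o)\cdot \Lambda _h n = -a_h\cdot \Lambda _h n - (e-o)\cdot n = t - a_h\cdot \Lambda _h n\,. \end{aligned}$$

In particular \(he \in \Sigma _{\Lambda _h n, t_h}\) for every \(e\in \Sigma _{n,t}\), and applying the same argument to \(h^{-1}\) yields the opposite inclusion, which gives the identity of sets in (3).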

The Euclidean group \({{{\mathcal {E}}}}_n\) of \(\Sigma _{n,t}\), i.e., the group of isometries of the Euclidean metric \(h_{n,t}\) induced on \(\Sigma _{n,t}\) by g, coincides with the subgroup of \(IO(1,3)_+\) of elements \((\Lambda ,a)\) which preserve n:

$$\begin{aligned} {{{\mathcal {E}}}}_n:= \{ h\in IO(1,3)_+ \,|\, \Lambda _h n= n \}\,. \end{aligned}$$
(4)

With the choice of an origin o, \({{\mathbb {M}}}\) is identified with \({\textsf{V}}\) by means of the bijective map \({{\mathbb {M}}}\ni e \mapsto e-o \in {\textsf{V}}\). The choice of a basis \(\{v_0,\ldots , v_3\}\subset {\textsf{V}}\) defines a (global) Cartesian coordinate system of origin o given by \({{\mathbb {M}}}\ni e \mapsto (x^0(e),\ldots , x^3(e)) \in {{\mathbb {R}}}^4\) where \(e= o+ \sum _{\alpha =0}^3 x^\alpha (e)v_\alpha \). That system of Cartesian coordinates is said to be Minkowskian if the basis is pseudo-orthonormal. A Minkowskian coordinate system, with coordinates \(x^0=t,x^1,x^2,x^3\), is co-moving with \(n\in {{\textsf{T}}}_+\) if \(\frac{\partial }{\partial x^0}=n\). Evidently \(x^1,x^2,x^3\) define (global) Cartesian orthonormal coordinates on each \(\Sigma _{n,t}\) referring to the Euclidean metric \(h_{n,t}\) induced on it by g.

\({\mathscr {B}}(\Sigma _{n,t})\) will denote the family of Borel subsets of \(\Sigma _{n,t}\). Independently of the choice of the coordinates, \(h_{n,t}\) induces a positive regular Borel measure \(\textrm{d}\Sigma _{n,t}\) on \(\Sigma _{n,t}\). In the above coordinates \(x^1,x^2,x^3\), that measure is the restriction \(d^3x=\textrm{d}x^1\textrm{d}x^2\textrm{d}x^3\) of the Lebesgue measure on \({{\mathbb {R}}}^3\) to the Borel sets. As a consequence, the completion of \(d^3x\) is the Lebesgue measure itself. The corresponding completion of \(\textrm{d}\Sigma _{n,t}\) will be called the Lebesgue measure on \(\Sigma _{n,t}\). I will use the same symbol \(\textrm{d}\Sigma _{n,t}\) for the measure and its completion, as the difference will be clear from the \(\sigma \)-algebra in use. The Lebesgue \(\sigma \)-algebra on \(\Sigma _{n,t}\) will be denoted by \({\mathscr {L}}(\Sigma _{n,t})\).

2.3 Completion of measures and \(L^2\) spaces

A positive \(\sigma \)-additive measure \(\mu : \Sigma (X) \rightarrow [0,+\infty ]\) and its completion \({\overline{\mu }}: \overline{\Sigma (X)} \rightarrow [0,+\infty ]\) give rise to the same Hilbert space \(L^2(X, \mu )\) since (see, e.g., Proposition 1.57 [28]), for every \(\overline{\Sigma (X)} \)-measurable function f, there is a \(\Sigma (X)\)-measurable function g such that \(f=g\) is true \(\mu \)-almost everywhere and either \(\int _X f \textrm{d}{\overline{\mu }}= \int _X g \textrm{d}\mu \) or both the integrals do not exist. The identity evidently extends to \(L^2\)-scalar products of pairs of corresponding functions. The map \(L^2(\mu ) \ni [f]_\mu \mapsto [f]_{{\overline{\mu }}} \in L^2({\overline{\mu }})\) is a Hilbert space isomorphism.

2.4 Hilbert space and Poincaré group representation for the massive Klein–Gordon particle

In the rest of this work, I will take advantage of the Einstein convention of summation over repeated Greek indices, from 0 to 3.

Let us consider a Klein–Gordon real particle of mass \(m>0\) described by the \(C^\infty \) scalar field \(\varphi : {{\mathbb {M}}}\rightarrow {{\mathbb {R}}}\) satisfying the normally hyperbolic Klein–Gordon equation

$$\begin{aligned} \Box \varphi - m^2 \varphi =0\,, \quad \text{ where } \Box :=\eta ^{\mu \nu } \partial _\mu \partial _\nu \text{ in } \text{ every } \text{ Minkowski } \text{ coordinate } \text{ system }\,. \end{aligned}$$

As is well-known, the quantization of that system, viewed as the restriction to the one-particle space of the second quantization procedure, relies on the Hilbert space of pure state vectors

$$\begin{aligned} {{\mathcal {H}}}:= L^2({\textsf{V}}_{m,+}, \mu _m)\,. \end{aligned}$$

Above, if \({\textsf{V}}_{m,+}:= \{p\in {\textsf{V}}\,|\, g(p,p) = -m^2\,, \,p\in {\textsf{V}}_+\} \) denotes the mass shell of (positive energy) four-momenta of mass m, the Hilbert space inner product reads

$$\begin{aligned} \langle \psi | \psi '\rangle := \int _{{\textsf{V}}_{m,+}} \overline{\psi (p)}\psi '(p) \textrm{d}\mu _m(p) \,. \end{aligned}$$
(5)

Above, \(\mu _m(p)\) is the Lorentz-invariant (positive Borel regular) measure which takes the form

$$\begin{aligned} \textrm{d}\mu _m(p) = \frac{d^3p}{E_n(p)} \,, \quad E_n(p):= -n\cdot p \end{aligned}$$
(6)

in every Minkowskian reference frame co-moving with \(n\in {{\textsf{T}}}_+\), \(d^3p= \textrm{d}p^1\textrm{d}p^2\textrm{d}p^3\) being the standard Lebesgue measure on \({{\mathbb {R}}}^3\) identified with the rest spaces of n by means of any Minkowskian coordinate system co-moving with n (that measure is independent of the chosen Minkowskian coordinate frame co-moving with n). Notice that

$$\begin{aligned} E_n(p) = \sqrt{\vec {p}_n^2 + m^2} =p^0\,, \quad \vec {p}_n:= p + (n \cdot p)n \equiv (p^1,p^2,p^3) \end{aligned}$$

are, respectively, the n-temporal component and the n-spatial component of the four-momentum p, corresponding to \(p^0\) and the triple \((p^1,p^2,p^3)\) in any Minkowski coordinate system co-moving with n. As \(E_n(p)\) depends only on \(\vec {p}_n\), I will occasionally write \(E_n(\vec {p}_n)\) in place of \(E_n(p)\).
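
The expression of \(E_n(p)\) in terms of \(\vec {p}_n\) is a direct consequence of the mass-shell condition; the short verification below is a sketch that uses only \(n\cdot n=-1\) and the definitions above. First, \(\vec {p}_n\) is orthogonal to n, because \(n\cdot \vec {p}_n = n\cdot p + (n\cdot p)(n\cdot n) = 0\). Writing \(p = E_n(p)\, n + \vec {p}_n\), one then finds

$$\begin{aligned} -m^2 = g(p,p) = E_n(p)^2\, g(n,n) + 2E_n(p)\, n\cdot \vec {p}_n + g(\vec {p}_n,\vec {p}_n) = -E_n(p)^2 + \vec {p}_n^{\,2}\,, \end{aligned}$$

that is \(E_n(p)^2 = \vec {p}_n^{\,2}+m^2\). The positive root must be taken because \(E_n(p) = -n\cdot p >0\) when both n and p belong to \({\textsf{V}}_+\).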

As usual, the (normal pure) quantum states of the particle are represented by the unit vectors \(\psi \in {{{\mathcal {H}}}}\) up to phases.

The inner product (5) is invariant under the strongly-continuous unitary (active) action, induced by (2) (see Footnote 1), of the orthochronous Poincaré group \(IO(1,3)_+\):

$$\begin{aligned} (U_{(\Lambda , a)}\psi )(p):= e^{-i p\cdot a} \psi (\Lambda ^{-1}p) \quad \text{ if } \psi \in {{{\mathcal {H}}}} \text{ and } (\Lambda , a)\in IO(1,3)_+ \end{aligned}$$
(7)

This invariance property arises from the \(O(1,3)_+\) invariance of \(\mu _m\):

$$\begin{aligned} \mu _m(\Lambda E) = \mu _m(E) \quad \text{ for } \text{ every } \text{ Borel } \text{ set } E \text{ in } {\textsf{V}}_{m,+}\,. \end{aligned}$$
(8)
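
For completeness, here is how (8) yields the invariance of the inner product; this is a sketch in which the change of variables \(p=\Lambda q\) is the only ingredient, and it uses that \(\Lambda \in O(1,3)_+\) maps \({\textsf{V}}_{m,+}\) onto itself since it preserves both g and \({\textsf{V}}_+\). For \(\psi \in {{{\mathcal {H}}}}\),

$$\begin{aligned} \Vert U_{(\Lambda ,a)}\psi \Vert ^2 = \int _{{\textsf{V}}_{m,+}} |e^{-ip\cdot a}|^2\, |\psi (\Lambda ^{-1}p)|^2\, \textrm{d}\mu _m(p) = \int _{{\textsf{V}}_{m,+}} |\psi (q)|^2\, \textrm{d}\mu _m(\Lambda q) = \int _{{\textsf{V}}_{m,+}} |\psi (q)|^2\, \textrm{d}\mu _m(q) = \Vert \psi \Vert ^2\,. \end{aligned}$$

Hence \(U_{(\Lambda ,a)}\) is isometric, and the polarization identity extends the result to inner products. It is also surjective: given \(\phi \in {{{\mathcal {H}}}}\), the vector \(\chi (p):= e^{i(\Lambda p)\cdot a}\phi (\Lambda p)\) belongs to \({{{\mathcal {H}}}}\) by the same change of variables and satisfies \(U_{(\Lambda ,a)}\chi =\phi \). Therefore \(U_{(\Lambda ,a)}\) is unitary.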

The action of the subgroup of time translations along the time direction \(n\in {{\textsf{T}}}_+\) reads

$$\begin{aligned} U_{(I, \tau n)}\psi (p) = e^{i \tau E_n(p)}\psi (p)\,, \end{aligned}$$

so that the self-adjoint generator of the one-parameter group, the multiplicative operator

$$\begin{aligned}&(P_{n 0}\psi )(p):= -(H_n\psi )(p):= -E_n(p)\psi (p)\,, \\&D(H_n):= \left\{ \psi \in L^2({\textsf{V}}_{m,+}, \textrm{d}\mu _m)\,\left| \, \int _{{\textsf{V}}_{m,+}} E_n(p)^2 |\psi (p)|^2 \textrm{d}\mu _m <+\infty \right. \right\} \end{aligned}$$
(9)

has negative spectrum since \(\sigma (H_n) = \sigma _c(H_n) = [m,+\infty )\). In this formalism, the time-evolution operator in n is

$$\begin{aligned} U^{(n)}_\tau := U_{(I, -\tau n)} = e^{-i\tau H_n}\,. \end{aligned}$$
(10)
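
The signs in the formulas above can be checked directly against (7); the following lines are just that check, under the conventions already fixed (\(E_n(p) = -n\cdot p\) and \(H_n\) the multiplication operator by \(E_n\)). Taking \((\Lambda ,a) = (I,\tau n)\) in (7), one has \(p\cdot a = \tau \, n\cdot p = -\tau E_n(p)\), so that

$$\begin{aligned} (U_{(I, \tau n)}\psi )(p) = e^{-i p\cdot (\tau n)}\psi (p) = e^{i \tau E_n(p)}\psi (p) = \left( e^{i\tau H_n}\psi \right) (p)\,. \end{aligned}$$

Replacing \(\tau \) by \(-\tau \) gives \(U_{(I, -\tau n)} = e^{-i\tau H_n}\), in agreement with (10).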

\(H_n\) is the Hamiltonian operator in the reference frame \(n\in {{\textsf{T}}}_+\). The self-adjoint generators of the spatial translations

$$\begin{aligned} U_{(I, a v_k)}\psi (p) = e^{-i a p_k}\psi (p)\,, \end{aligned}$$

in n along the spatial unit vectors \(v_k\) of a co-moving Minkowskian coordinate system are therefore the multiplicative operators

$$\begin{aligned}&P_{n k}:= p_k \cdot = (\vec {p}_n)_k \cdot \,,\quad k=1,2,3\,, \\&D(P_{n k}):= \left\{ \psi \in L^2({\textsf{V}}_{m,+}, \textrm{d}\mu _m)\,\left| \, \int _{{\textsf{V}}_{m,+}} (\vec {p}_n)_k^2 |\psi (p)|^2 \textrm{d}\mu _m <+\infty \right. \right\} . \end{aligned}$$
(11)

Evidently, \(\sigma (P_{nk})= \sigma _c(P_{nk})= {{\mathbb {R}}}\) for \(k=1,2,3\).

The operators \((P_{n0}, P_{n1}, P_{n2}, P_{n3})\) define the (covariant) components of the four-momentum in n with respect to the relevant Minkowskian coordinate system co-moving with n. No specification of time t is necessary because \(P_{n \alpha }\) is trivially a constant of motion.

Definition 2

We say that \(\psi \in {{\mathcal {H}}}\) is of Schwartz type if there are \(n\in {{\textsf{T}}}_+\) and a Minkowski coordinate system co-moving with n such that the map \({{\mathbb {R}}}^3 \ni \vec {p}_n \mapsto \psi (E_n(\vec {p}_n), \vec {p}_n)\in {{\mathbb {C}}}\) belongs to \({{\mathscr {S}}}({{\mathbb {R}}}^3)\) (the Schwartz space on \({{\mathbb {R}}}^3\)) when represented in the spatial coordinates on \({{\mathbb {R}}}^3\). The subspace of \({{\mathcal {H}}}\) of vectors of Schwartz type will be denoted by \(\mathcal{S}({{{\mathcal {H}}}})\).

Proposition 3

The definition of \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) does not depend on the choice of n and of the co-moving Minkowskian coordinates. That is equivalent to saying that \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is invariant under the representation U of \(IO(1,3)_+\) in (7). Finally, \(\mathcal{S}({{{\mathcal {H}}}})\) is dense in \({{{\mathcal {H}}}}\).

Proof

See “Appendix A” \(\square \)

Proposition 4

\({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is invariant under the components of the four-momentum \(P_{n\alpha }\), \(\alpha =0,1,2,3\), referred to a reference frame \(n\in {{\textsf{T}}}_+\). Furthermore, \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is a core for each of those operators (i.e., the restriction of each of them to \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is essentially self-adjoint).

Proof

See “Appendix A” \(\square \)

If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the associated covariant wavefunction (the name is justified by (14) below) is

$$\begin{aligned} \varphi _\psi (x):= \int _{{\textsf{V}}_{m,+}} \frac{\psi (p)}{(2\pi )^{3/2}} e^{i p\cdot x} \textrm{d}\mu _m(p)\,, \end{aligned}$$
(12)

where \(x(e) = e-o\in {\textsf{V}}\) is the vector representation of the events in \({{\mathbb {M}}}\) with respect to the origin o.

Proposition 5

If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the associated wavefunction \(\varphi _\psi \) satisfies the following.

  1. (1)

    \(\varphi _\psi \in C^\infty ({{\mathbb {M}}}; {{\mathbb {C}}})\) and \(\varphi _\psi (t,\cdot ) \in {\mathscr {S}}({{\mathbb {R}}}^3)\) for every \(t\in {{\mathbb {R}}}\), where \({{\mathbb {R}}}^3 \equiv \Sigma _{n,t}\) through the choice of a Minkowskian coordinate system co-moving with any chosen \(n\in {{\textsf{T}}}_+\).

  2. (2)

    The Klein–Gordon equation is valid, \(\Box \varphi _\psi - m^2 \varphi _\psi =0\,.\)

  3. (3)

    If also \(\psi ' \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), then

    $$\begin{aligned} \langle \psi |\psi ' \rangle = \frac{i}{2}\int _{\Sigma _{n,t}} \left( \overline{\varphi _\psi } \partial _n \varphi _{\psi '} - \varphi _{\psi '} \partial _n \overline{\varphi _\psi } \right) \textrm{d}\Sigma _{n,t} \end{aligned}$$
    (13)

    where the right-hand side depends neither on the choice of \(n\in {{\textsf{T}}}_+\) nor on the choice of \(t\in {{\mathbb {R}}}\), since the left-hand side does not.

  4. (4)

    The action (7) of \(IO(1,3)_+\) induces the standard active action on scalar fields in \({{\mathbb {M}}}\),

    $$\begin{aligned} \varphi _{U_{(\Lambda , a)}\psi }(x) = \varphi _\psi \left( \Lambda ^{-1}(x-a)\right) \,. \end{aligned}$$
    (14)

Finally, \({{{\mathcal {H}}}}\) coincides with the completion of \(\mathcal{S}({{{\mathcal {H}}}})\) equipped with the inner product provided by the right-hand side of (13).

I leave the proof of these very well-known facts to the reader. They are based on elementary results of the theory of Fourier(-Plancherel) transform. The last statement immediately arises from (13) and the last statement of Proposition 3.
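
For the reader’s convenience, the two computations behind items (3) and (4) are sketched here; they assume \(\psi ,\psi '\in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), so that differentiation under the integral sign and the Plancherel theorem apply. Concerning (14), the change of variables \(p=\Lambda q\) in (12), together with (7), (8) and the Lorentz invariance of g, gives

$$\begin{aligned} \varphi _{U_{(\Lambda , a)}\psi }(x) = \int _{{\textsf{V}}_{m,+}} \frac{\psi (\Lambda ^{-1}p)}{(2\pi )^{3/2}}\, e^{i p\cdot (x-a)}\, \textrm{d}\mu _m(p) = \int _{{\textsf{V}}_{m,+}} \frac{\psi (q)}{(2\pi )^{3/2}}\, e^{i q\cdot \Lambda ^{-1}(x-a)}\, \textrm{d}\mu _m(q) = \varphi _\psi \left( \Lambda ^{-1}(x-a)\right) \,. \end{aligned}$$

Concerning the inner product, in Minkowskian coordinates co-moving with n one has \(p\cdot x = -E_n(p)t + \vec {p}\cdot \vec {x}\) and \(\partial _n = \partial /\partial t\), so that (6) and (12) yield

$$\begin{aligned} \varphi _\psi (t,\vec {x}) = \int _{{{\mathbb {R}}}^3} \frac{\psi (p)\, e^{-iE_n(p)t}}{E_n(p)}\, \frac{e^{i\vec {p}\cdot \vec {x}}}{(2\pi )^{3/2}}\, d^3p\,, \qquad \partial _n\varphi _{\psi '}(t,\vec {x}) = \int _{{{\mathbb {R}}}^3} \left( -i\psi '(p)\, e^{-iE_n(p)t}\right) \frac{e^{i\vec {p}\cdot \vec {x}}}{(2\pi )^{3/2}}\, d^3p\,. \end{aligned}$$

Both functions are therefore inverse Fourier transforms of Schwartz functions, and the Plancherel theorem gives

$$\begin{aligned} \int _{\Sigma _{n,t}} \overline{\varphi _\psi }\, \partial _n \varphi _{\psi '}\, \textrm{d}\Sigma _{n,t} = -i\int _{{{\mathbb {R}}}^3} \overline{\psi (p)}\,\psi '(p)\, \frac{d^3p}{E_n(p)} = -i\,\langle \psi |\psi '\rangle \,, \end{aligned}$$

where the time-dependent phases cancel. This is the basic identity behind (13) and it makes evident why the result depends on neither t nor n.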

3 The Newton–Wigner observable for the massive Klein–Gordon particle

3.1 The Newton–Wigner PVM

I assume that the reader is well acquainted with basic notions of spectral theory and the notion of Projection Valued Measure (PVM) (see, e.g., [28, 29]).

Consider a (separable) Hilbert space \({{{\mathcal {H}}}}\) that defines the pure states of a quantum particle, not necessarily Klein–Gordon nor relativistic, but possibly equipped with spin and other internal observables. According to Wightman [38],

Definition 6

A Newton–Wigner PVM [31, 38] for a particle described in the (complex, separable) Hilbert space \({{{\mathcal {H}}}}\) is defined as a PVM \({{\textsf{P}}}: {\mathscr {B}}({{\mathbb {R}}}^3) \rightarrow {{\mathfrak {B}}}({{{\mathcal {H}}}})\)—where \({\mathscr {B}}({{\mathbb {R}}}^3) \) is the Borel \(\sigma \)-algebra of \({{\mathbb {R}}}^3\)—which is covariant with respect to a strongly continuous unitary representation V of the group of isometries \({{{\mathcal {E}}}}\) of \({{\mathbb {R}}}^3\) in \({{{\mathcal {H}}}}\):

$$\begin{aligned} V_g {{\textsf{P}}}(\Delta ) V_{g}^{-1} = {{\textsf{P}}}(g\Delta )\,, \quad \forall \Delta \in {\mathscr {B}}({{\mathbb {R}}}^3) \,, \,\forall g \in {{{\mathcal {E}}}}\,. \end{aligned}$$
(15)

\({{\mathbb {R}}}^3\) is above interpreted as the joint spectrum of three Newton–Wigner position self-adjoint operators

$$\begin{aligned} R_k:= \int _{{{\mathbb {R}}}^3} x_k \textrm{d}{{\textsf{P}}}(x_1,x_2,x_3)\,, \quad k=1,2,3. \end{aligned}$$
(16)

Remark 7

Wightman, on account of Mackey’s theory of imprimitivity systems, established the uniqueness of the Newton–Wigner position observable for a given strongly continuous unitary representation V of the Euclidean group \({{{\mathcal {E}}}}\), under suitable regularity requirements on V and invariance under time-reversal symmetry. A more recent discussion appears in [8]. For a technically extensive discussion concerning relativistic systems with every value of the squared mass (also understood as an operator) and of the spin, see [8, 9]. \(\blacksquare \)

According to the general interpretation of the formalism, the physical interpretation of a Newton–Wigner PVM is that \(\langle \psi | {{\textsf{P}}}(\Delta )\psi \rangle \) is the probability to find the particle in the region \(\Delta \subset {{\mathbb {R}}}^3\) when the pure state is represented by \(\psi \in {{{\mathcal {H}}}}\).

In the case of the real scalar Klein–Gordon particle, a Newton–Wigner PVM \({{\textsf{Q}}}_{n,t}\) (see Footnote 2) is constructed as follows on the rest 3-space \(\Sigma _{n,t}\) of a reference frame \(n\in {{\textsf{T}}}_+\). Here, the restriction V of \(U: IO(1,3)_+ \rightarrow {{\mathfrak {B}}}({{{\mathcal {H}}}})\) (7) to the Euclidean subgroup \({{{\mathcal {E}}}}_n\) (4) is used to implement Wightman’s definition. As before, events \(e\in {{\mathbb {M}}}\) are identified with vectors through \(x(e) = e-0 \in {\textsf{V}}\).

If \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), and \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) define

$$\begin{aligned} \left( {{\textsf{Q}}}_{n,t} (\Delta )\psi \right) (p)&:=\int _\Delta \textrm{d}\Sigma _{n,t}(x) \int _{{\textsf{V}}_{m,+}} \textrm{d}\mu _m(q) \frac{e^{-i(p-q)\cdot x}}{(2\pi )^3} \sqrt{E_n(p)E_n(q)} \psi (q)\nonumber \\&\quad \text{ with } -n \cdot x=t. \end{aligned}$$
(17)

Above, \(\Delta \in {\mathscr {B}}(\Sigma _{n,t})\) and \(\textrm{d}\Sigma _{n,t}(x)= d^3x\) in Minkowskian coordinates co-moving with n. As the mathematical objects appearing in the formula are coordinate independent once \(n\in {{\textsf{T}}}_+\) is chosen, the operator on the left-hand side depends only on \((n,t)\). The resulting family of operators defines a Newton–Wigner observable on every slice \(\Sigma _{n,t}\) in the sense of Definition 6, because of the following result.

Proposition 8

Each operator of the \((n,t, \Delta )\)-parametrized family (17) defined on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and taking values in \({{\mathcal {H}}}\), uniquely extends by continuity to the whole space \(\mathcal H\). The found family of operators, for \(t\in {{\mathbb {R}}}\) fixed, defines a PVM on \({\mathscr {B}}(\Sigma _{n,t})\) satisfying the covariance requirement (15) with respect to the group of isometries \({{{\mathcal {E}}}}_n\) (4) of \(\Sigma _{n,t}\).

Denoting the resulting orthogonal projectors by the same symbol \({{\textsf{Q}}}_{n,t}(\Delta )\), the action of \(IO(1,3)_+\) on them reads

$$\begin{aligned} U_{h} {{\textsf{Q}}}_{n,t}(\Delta ) U_{h}^{-1} = {{\textsf{Q}}}_{\Lambda _h n,t_h}(h\Delta ) \,, \quad \forall \Delta \in {\mathscr {B}}(\Sigma _{n,t})\,, \quad h \in IO(1,3)_+\,. \end{aligned}$$
(18)

Proof

Fix a Minkowskian coordinate system co-moving with n. Define the unitary map

$$\begin{aligned} S_n: L^2({\textsf{V}}_{m,+}, \mu _m) \ni \psi (p) \mapsto \frac{\psi (E_n(p), \vec {p}_n)}{\sqrt{E_n(p)}} \in L^2({{\mathbb {R}}}^3, d^3p)\,, \end{aligned}$$
(19)

where \(\vec {p}_n\equiv (p_1,p_2,p_3) \in {{\mathbb {R}}}^3\) according to the said choice of a Minkowskian coordinate system. Notice that, as \(m>0\), the written map restricts to a bijection from \({{{\mathcal {S}}}}(\mathcal{H})\), which is dense in \({{{\mathcal {H}}}}=L^2({\textsf{V}}_{m,+}, \mu _m)\), onto \({\mathscr {S}}({{\mathbb {R}}}^3)\) viewed as a dense subspace of \(L^2({{\mathbb {R}}}^3, d^3p)\). It follows that, for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), (17) can be reformulated as

$$\begin{aligned} {{\textsf{Q}}}_{n,t} (\Delta )\psi = U_{(I, t n)} S_n^{-1} {{\mathcal {F}}}\,1_\Delta \, {{\mathcal {F}}}^{-1} S_n U^{-1}_{(I, t n)}\,\psi \end{aligned}$$
(20)

Above, \(1_\Delta \) is the multiplicative operator with the characteristic function of \(\Delta \subset {{\mathbb {R}}}^3 \equiv \Sigma _{n,t}\) (\(1_\Delta ({x})=1\) if \({x}\in \Delta \) and \(1_\Delta ({x})=0\) otherwise); \({{\mathcal {F}}}: L^2(\Sigma _{n,t}, \textrm{d}\Sigma _{n,t}) \rightarrow L^2({{\mathbb {R}}}^3, d^3p)\) is the Fourier–Plancherel unitary transform (after having identified \(\Sigma _{n,t}\) with \({{\mathbb {R}}}^3\) and \(\textrm{d}\Sigma _{n,t}\) with the Lebesgue measure \(d^3x\) through the same choice of a Minkowskian coordinate system as above). \({{\mathcal {F}}}\) and its inverse preserve the Schwartz space. The map \({\mathscr {B}}({{\mathbb {R}}}^3) \ni \Delta \mapsto 1_\Delta \in {{\mathfrak {B}}}(L^2({{\mathbb {R}}}^3,d^3x))\) is evidently a PVM in the written Hilbert space. As \({{\mathcal {F}}}^{-1} S_n\) is norm preserving and, when restricted to the dense subspace of Schwartz functions, has a dense range, \( S_n^{-1} {{\mathcal {F}}}\, 1_\Delta \, {{\mathcal {F}}}^{-1} S_n|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})}\) extends to an everywhere-defined bounded operator, and the resulting map \(\Delta \mapsto S_n^{-1} {{\mathcal {F}}}\, 1_\Delta \, {{\mathcal {F}}}^{-1} S_n\) is also a PVM. Identity (15) is an immediate consequence of (18) when \({{{\mathcal {E}}}}_3\) is identified with \({{{\mathcal {E}}}}_n\) (4). Let us prove (18). From (20), for \(\psi ,\psi '\in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), the Fubini and Tonelli theorems yield

$$\begin{aligned}{} & {} \langle \psi '| {{\textsf{Q}}}_{n,t}(\Delta ) \psi \rangle \nonumber \\{} & {} = \int _{{\textsf{V}}_{m,+}} \textrm{d}\mu _m(p) \overline{\psi '(p)}\int _\Delta \textrm{d}\Sigma _{n,t}(x) \int _{{\textsf{V}}_{m,+}} \textrm{d}\mu _m(q) \frac{e^{-i(p-q)\cdot x}}{(2\pi )^3} \sqrt{E_n(p)E_n(q)} \psi (q)\nonumber \\{} & {} =\int _{\Sigma _{n,t}} \textrm{d}\Sigma _{n,t}(x) 1_{\Delta }(x)\int _{{\textsf{V}}_{m,+}} \textrm{d}\mu _m(p) \frac{e^{-i p\cdot x} \sqrt{E_n(p)}}{(2\pi )^{3/2}} \overline{\psi '(p)} \int _{{\textsf{V}}_{m,+}} \textrm{d}\mu _m(q) \frac{e^{iq\cdot x}\sqrt{E_n(q)}}{(2\pi )^{3/2}} \psi (q)\nonumber \\ \end{aligned}$$
(21)

where \(-n\cdot x= t\) and the integrals are interpreted in the proper sense. Let us define

$$\begin{aligned} f_n(x):=\int _{{\textsf{V}}_{m,+}} \textrm{d}\mu _m(p) \frac{e^{-i p\cdot x} \sqrt{E_n(p)}}{(2\pi )^{3/2}} \overline{\psi '(p)} \int _{{\textsf{V}}_{m,+}} \textrm{d}\mu _m(q) \frac{e^{iq\cdot x}\sqrt{E_n(q)}}{(2\pi )^{3/2}} \psi (q)\,. \end{aligned}$$

At this juncture, I take advantage of (8) and observe that the \({{\mathbb {M}}}\)-isometry invariance of the measures induced by the metric, \(\textrm{d}\Sigma _{n,t}(x) = dh\Sigma _{n,t}(hx) \), entails, for \(h\in IO(1,3)_+\),

$$\begin{aligned}{} & {} \int _{x\in \Sigma _{n,t}} \textrm{d}\Sigma _{n,t}(x) 1_{\Delta }(x) f_{\Lambda _hn}(h x) = \int _{h x\in h \Sigma _{n,t}} dh \Sigma _{n,t}(h x) 1_{\Delta }(x) f_{\Lambda _hn}(h x)\\{} & {} \quad = \int _{y\in h \Sigma _{n,t}} dh\Sigma _{n,t}(y) 1_{\Delta }(h^{-1} y) f_{\Lambda _hn}(y) = \int _{y\in h \Sigma _{n,t}} dh \Sigma _{n,t}(y) 1_{h \Delta }(y) f_{\Lambda _hn}(y)\\{} & {} \quad = \int _{\Sigma _{\Lambda _h n,t_h}} \textrm{d}\Sigma _{\Lambda _h n,t_h}(y) 1_{h\Delta }(y) f_{\Lambda _hn}(y) = \int _{\Sigma _{\Lambda _h n,t_h}} \textrm{d}\Sigma _{\Lambda _h n,t_h}(x) 1_{h\Delta }(x) f_{\Lambda _hn}(x)\,. \end{aligned}$$

Using this identity in (21) and taking (7) into account, one finds

$$\begin{aligned} \langle \psi '| (U_{h} {{\textsf{Q}}}_{n,t}(\Delta ) U_{h}^{-1} - {{\textsf{Q}}}_{\Lambda _h n,t_h}(h\Delta )) \psi \rangle =0\quad \text{ if } \psi ,\psi '\in \mathcal{S}({{{\mathcal {H}}}}). \end{aligned}$$

Since \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is dense in \({{{\mathcal {H}}}}\) and the operators are bounded and everywhere defined, the found identity extends to the general case \(\psi ,\psi '\in {{{\mathcal {H}}}}\) ending the proof. \(\square \)
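The structure \({{\mathcal {F}}}\,1_\Delta \,{{\mathcal {F}}}^{-1}\) exploited in the proof can be checked numerically in a toy setting. The following is only a minimal sketch, assuming a periodic one-dimensional lattice in place of \({{\mathbb {R}}}^3\) and the unitary discrete Fourier transform in place of \({{\mathcal {F}}}\); the function name `nw_projector` and the lattice size are hypothetical choices made for illustration. It builds the matrices corresponding to two disjoint regions and to their union, so that idempotence, self-adjointness, orthogonality and additivity can be tested.

```python
import numpy as np

def nw_projector(indicator):
    """Matrix of F 1_Delta F^{-1} on a periodic lattice (toy analogue of (20) at t = 0).

    `indicator` is a boolean array marking the lattice sites belonging to Delta.
    """
    n = indicator.size
    # Unitary DFT matrix: the FFT of the identity with orthonormal normalization.
    F = np.fft.fft(np.eye(n), norm="ortho")
    return F @ np.diag(indicator.astype(float)) @ F.conj().T

n = 64
sites = np.arange(n)
delta1 = (sites >= 10) & (sites < 20)   # first region
delta2 = (sites >= 30) & (sites < 45)   # second region, disjoint from the first

P1 = nw_projector(delta1)
P2 = nw_projector(delta2)
P12 = nw_projector(delta1 | delta2)

print("idempotent:", np.allclose(P1 @ P1, P1))
print("self-adjoint:", np.allclose(P1, P1.conj().T))
print("orthogonal for disjoint regions:", np.allclose(P1 @ P2, 0))
print("additive:", np.allclose(P1 + P2, P12))
```

The lattice version is unitarily equivalent to a multiplication operator by construction, which is exactly the mechanism that makes \({{\textsf{Q}}}_{n,t}\) a PVM; the sketch does not address covariance under \(IO(1,3)_+\).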

Definition 9

The family \(\{{{\textsf{Q}}}_{n,t}(\Delta )\}_{\Delta \in {\mathscr {B}}(\Sigma _{n,t})}\) constructed in Proposition 8 is the Newton–Wigner PVM of the massive Klein–Gordon particle in the reference frame n at time t. The collection \({{\textsf{Q}}}\) of all these PVMs when \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\) is the Newton–Wigner spatial localization observable.

Remark 10

  1. (1)

    In view of the \(IO(1,3)_+\) covariance and (10)

    $$\begin{aligned} {{\textsf{Q}}}_{n,t}(\Delta +t) = U_t^{(n)\dagger } {{\textsf{Q}}}_{n,0}(\Delta ) U_t^{(n)}\,,\quad \forall t\in {{\mathbb {R}}}\,\,, \forall \Delta \in {\mathscr {B}}(\Sigma _{n,0})\,. \end{aligned}$$
    (22)

    In other words, the Newton–Wigner PVM at time t in n is the Heisenberg evolution of the one at time zero according to the time-evolution operator in the reference frame n.

  2. (2)

    The non-relativistic limit for a state \(\psi \in \mathcal{H}\), in a reference frame \(n\in {{\textsf{T}}}_+\), can be viewed as the requirement that \(|\psi (p)|\) vanishes outside a region where the energy \(E_n(p)\) is narrowly concentrated around m, i.e., where \(|\vec {p}_n| \ll m\). It is easy to see from (12) that, in this situation, \(m\varphi _\psi \) tends to become a standard Schrödinger wavefunction for a free particle of mass m. Using the same type of states in (21) shows that \(\langle \psi | {{\textsf{Q}}}_{n,0}(\Delta )\psi \rangle \) tends to the probability of finding the particle in \(\Delta \) (at \(t=0\)) according to the standard non-relativistic position PVM on the said state \(\psi \).

  3. (3)

    There is, however, another regime where the Newton–Wigner PVM approximates the PVM of the classical position observable. It is when \(\psi \) is sharply narrowed around a value of the momentum \(p_0\). In that case, similarly to before, \(E(p_0)\varphi _\psi \) tends to become a standard Schrödinger wavefunction for a free particle of mass m and \(\langle \psi | {{\textsf{Q}}}_{n,0}(\Delta )\psi \rangle \) tends to the probability of finding the particle in \(\Delta \) (at \(t=0\)) according to the standard non-relativistic position PVM. \(\blacksquare \)

3.2 NW localization does not mean localized covariant wavefunctions: antilocality

I am in a position to illustrate an annoying fact which sharply distinguishes the relativistic and the non-relativistic theory. Newton–Wigner localization in a bounded set \(\Delta \subset \Sigma _{n,t}\) for a state \(\psi \) implies that the associated wavefunction \(\varphi _\psi \) is essentially supported also outside \(\Delta \) itself at time t.

Choose a reference frame n and a co-moving Minkowskian coordinate system \(t=x^0,x^1,x^2,x^3\) and write \(\vec {x}:= (x^1,x^2,x^3)\). Looking at (20), if \(\psi \in {{{\mathcal {H}}}}\), define

$$\begin{aligned} \Psi _t:=\left( {{{\mathcal {F}}}}^{-1}S_n U^{-1}_{(I,tn)} \psi \right) \in L^2({{\mathbb {R}}}^3,d^3x) \end{aligned}$$
(23)

Notice that \(\Psi _t \in {\mathscr {S}}({{\mathbb {R}}}^3)\) if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), where \({{\mathbb {R}}}^3\) is identified with \(\Sigma _{n,t}\). On account of (20), the action of \({{\textsf{Q}}}_{n,t}(\Delta )\) on \(\Psi _t\) is simply the multiplication by \(1_{\Delta }(\vec {x})\). On the other hand, the definition of covariant wavefunction associated to a state (12) can be reformulated in terms of \(\Psi _t\):

$$\begin{aligned} \varphi _{\psi }(t,\vec {x}):= (\overline{- \Delta + m^2})^{-1/4}\Psi _t(\vec {x})\,, \qquad \psi \in {{{\mathcal {H}}}}\,. \end{aligned}$$
(24)

This definition is valid for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) as the original version (12) is. However, as indicated, it can be trivially extended to the general case \(\psi \in {{{\mathcal {H}}}}\), since the self-adjoint operator \((\overline{- \Delta + m^2})^{-1/4}\) is bounded and everywhere defined in \(L^2({{\mathbb {R}}}^3, d^3x)\). In that case, the covariant wavefunction satisfies (see Footnote 3) \(\varphi _\psi (t,\cdot ) \in L^2({{\mathbb {R}}}^3,d^3x)\). A crucial property of \((\overline{- \Delta + m^2})^{\alpha }\), known as antilocality [30, 35], plays a fundamental role in the rest of the paper.

Theorem 11

Let \(k \in {{\mathbb {N}}}\), \(m>0\), and suppose that \({{\mathbb {R}}}\ni \alpha \not \in {{\mathbb {Z}}}\). If both \(\Psi \in L^2({{\mathbb {R}}}^k, d^kx)\) and \((\overline{- \Delta + m^2})^{\alpha } \Psi \) vanish a.e. with respect to \(d^kx\) in an open non-empty set \(\Omega \subset {{\mathbb {R}}}^k\)—assuming \(\Psi \in D((\overline{- \Delta + m^2})^{\alpha })\) for \(\alpha >0\)—then \(\Psi =0\) in \( L^2({{\mathbb {R}}}^k, d^kx)\).

This theorem, together with Eq. (24), allows one to prove a well-known annoying fact regarding spatial localization according to NW: localized states do not correspond to localized covariant wavefunctions (item (2) below).

Proposition 12

Let us consider the Newton–Wigner localization observable \({{\textsf{Q}}}\) of a massive Klein–Gordon particle. The following facts are true for given \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\).

  1. (1)

    \({{\textsf{Q}}}_{n,t}(\Delta )=0\) if and only if \(\Delta \) has zero measure with respect to \(\textrm{d}\Sigma _{n,t}\).

  2. (2)

    Let \(\psi \in {{{\mathcal {H}}}}\setminus \{0\}\) be localized in a spatial region \(\Delta \in {\mathscr {B}}(\Sigma _{n,t})\), i.e.,

    $$\begin{aligned} {{\textsf{Q}}}_{n,t}(\Delta )\psi = \psi \,. \end{aligned}$$

    If \(\Delta \) is not dense (in particular, if \(\Delta \) is bounded), then \(\varphi _\psi (t, \cdot )\) cannot vanish a.e. in any non-empty open subset of \(\Sigma _{n,t}\setminus \Delta \).

Proof

(1) is obvious since, under unitary equivalence, \({{\textsf{Q}}}_{n,t}(\Delta )\) is the multiplicative operator \(1_{\Delta }\). Let us pass to (2). Suppose that \(\Delta \) is not dense and consider an open non-empty set \(\Omega \subset \Sigma _{n,t}\setminus \Delta \). \({{\textsf{Q}}}_{n,t}(\Delta )\psi = \psi \) is equivalent to \(1_{\Delta }\Psi _t = \Psi _t\) a.e. with respect to \(\textrm{d}\Sigma _{n,t}\), in particular, \(\Psi _t(\vec {x})=0\) a.e. in \(\Omega \). If also \(\varphi _\psi (t, \vec {x})= (\overline{- \Delta + m^2})^{-1/4}\Psi _t(\vec {x})=0\) a.e. for \(\vec {x} \in \Omega \), Theorem 11 applied to \(\Psi = \Psi _t\) for \(\alpha =-1/4\) would imply \( \Psi _t=0\), namely \(\psi =0\). This is impossible because \(||\psi || \ne 0\). \(\square \)
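The phenomenon stated in item (2) can be visualized numerically. The following is a minimal sketch, not a proof, assuming a periodic one-dimensional grid in place of \(\Sigma _{n,t}\) and implementing \((\overline{- \Delta + m^2})^{\alpha }\) as a Fourier multiplier via the FFT; the function name `apply_fractional_power`, the grid parameters and the choice of a smooth bump for \(\Psi _t\) are all hypothetical. The bump vanishes identically outside \(|x|<1\), whereas the associated \(\varphi _\psi \) computed through (24) is strictly positive in the window \(2<|x|<5\).

```python
import numpy as np

def apply_fractional_power(psi, dx, m, alpha):
    """Apply (-d^2/dx^2 + m^2)^alpha on a periodic grid as a Fourier multiplier."""
    k = 2.0 * np.pi * np.fft.fftfreq(psi.size, d=dx)  # angular wavenumbers
    return np.fft.ifft((k**2 + m**2) ** alpha * np.fft.fft(psi))

N, L, m = 4096, 40.0, 1.0
dx = L / N
x = (np.arange(N) - N // 2) * dx

# Smooth bump supported in |x| < 1: the toy analogue of a state with Q(Delta) psi = psi.
Psi = np.zeros(N)
inside = np.abs(x) < 1.0
Psi[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))

# Toy analogue of (24): phi = (-Laplacian + m^2)^(-1/4) Psi; the result is real up to rounding.
phi = apply_fractional_power(Psi, dx, m, -0.25).real

# A window strictly outside the support of Psi.
outside = (np.abs(x) > 2.0) & (np.abs(x) < 5.0)
print("max |Psi| in the window:", np.abs(Psi[outside]).max())
print("min phi in the window:  ", phi[outside].min())
```

The positivity seen here reflects the fact that the integral kernel of \((\overline{- \Delta + m^2})^{-1/4}\) is a positive Bessel-type function with exponential tails; Theorem 11 is a much stronger statement, valid for arbitrary \(\Psi \) and every non-integer \(\alpha \).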

3.3 The Newton–Wigner position self-adjoint operator

I now define the Newton–Wigner position self-adjoint operators. Given a reference frame \(n\in {{\textsf{T}}}_+\), choose a co-moving Minkowskian coordinate system \(t:=x^0,x^1,x^2,x^3\). Following [31, 38], the Newton–Wigner position self-adjoint operators in n associated with these coordinates are

$$\begin{aligned} N_{n,t}^\alpha := \int _{\Sigma _{n,t}} x^\alpha \textrm{d}{{\textsf{Q}}}_{n,t}(x)\quad \alpha = 0,1,2,3\,, \end{aligned}$$
(25)

where the integration is the standard one according to a PVM (see, e.g., [29]).

Proposition 13

The Newton–Wigner position self-adjoint operators (25) satisfy the following.

  1. (1)

    \(\sigma ( N_{n,t}^\alpha ) = \sigma _c( N_{n,t}^\alpha ) = {{\mathbb {R}}}\) for every \(\alpha = 0,1,2,3\).

  2. (2)

    It holds \(D( N_{n,t}^\alpha ) \supset {{{\mathcal {S}}}}(\mathcal{H})\) and more strongly

    $$\begin{aligned} N_{n,t}^\alpha ( {{{\mathcal {S}}}}({{{\mathcal {H}}}})) \subset {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,, \end{aligned}$$
    (26)

    and \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is a core for all those operators.

  3. (3)

    The Heisenberg commutation relations hold, where \(k,h=1,2,3\):

    $$\begin{aligned} {[} N_{n,t}^k, N_{n,t}^h]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} =[ P_{n h}, P_{n k}]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} =0\,, \qquad [ N_{n,t}^k, P_{n h}]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} = i\delta ^k_hI|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} \nonumber \\ \end{aligned}$$
    (27)

    so that, in particular, the statement of the Heisenberg principle holds for \(k=1,2,3\):

    $$\begin{aligned} \Delta _\psi N^k_{n,t} \Delta _\psi P_{nk} \ge 1/2\,, \quad \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,. \end{aligned}$$
    (28)
  4. (4)

    The Heisenberg time evolution relation is valid:

    $$\begin{aligned} U^{(n)\dagger }_t N_{n,0}^kU^{(n)}_t\psi = N_{n,t}^k\psi = N_{n,0}^k\psi + t\frac{P_{nk }}{P_{n0}}\psi \quad \text{ for } \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \text{ and } k=1,2,3\,. \nonumber \\ \end{aligned}$$
    (29)
  5. (5)

    \(IO(1,3)_+\) covariance relations are true, if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and \(IO(1,3)_+ \ni h= (\Lambda _h, a_h)\),

    $$\begin{aligned} U_h N_{n,t}^\alpha U_h^{-1} \psi = (\Lambda ^{-1}_h)^\alpha _\beta ( N^{\beta }_{\Lambda _h n, t_{h}} - a_h^\beta I)\psi , \quad \forall h \in IO(1,3)_+\,. \nonumber \\ \end{aligned}$$
    (30)

Proof

See “Appendix A”. \(\square \)
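A heuristic way to see where (27) and (29) come from, which is only a sketch and not a substitute for the proof in Appendix A, is the following. Assume, as (17) and (20) suggest, that in the representation \(L^2({{\mathbb {R}}}^3, d^3p)\) reached through \(S_n\) the operators act on Schwartz functions f as

$$\begin{aligned} S_n N^k_{n,0} S_n^{-1} f = i\frac{\partial f}{\partial p_k}\,, \qquad S_n P_{nh} S_n^{-1} f = p_h f\,, \qquad S_n U^{(n)}_t S_n^{-1} f = e^{-itE_n(p)} f\,, \end{aligned}$$

where the last two identifications, and the identification of \(P_{nk}/P_{n0}\) with the multiplication by \(p_k/E_n(p)\), are assumptions about conventions fixed elsewhere in the paper. Then, using \(E_n(p) = \sqrt{m^2+ |\vec {p}_n|^2}\), so that \(\partial E_n/\partial p_k = p_k/E_n(p)\),

$$\begin{aligned} i\frac{\partial }{\partial p_k}(p_h f) - p_h\, i\frac{\partial f}{\partial p_k} = i\delta ^k_h f\,, \qquad e^{itE_n(p)}\, i\frac{\partial }{\partial p_k}\left( e^{-itE_n(p)} f\right) = i\frac{\partial f}{\partial p_k} + t\,\frac{p_k}{E_n(p)}\, f\,, \end{aligned}$$

which reproduce the last identity in (27) and the right-hand side of (29), respectively.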

If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), property (4) implies that the map \({{\mathbb {R}}}\ni t \mapsto \left( \langle \psi | N^{\alpha }_{n,t} \psi \rangle \right) _{\alpha =0,1,2,3} \in {{\mathbb {R}}}^4 \equiv {{\mathbb {M}}}\) is the coordinate description of a timelike curve, i.e., of the time evolution of a point in the rest space of n with speed strictly less than the speed of light. In fact, the following corollary holds, which strongly relies on the overall initial hypothesis \(m>0\). It is a sort of Ehrenfest theorem for the position of a massive free Klein–Gordon particle.

Corollary 14

Let \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) satisfy \(||\psi ||=1\). The expectation values of the Newton–Wigner position self-adjoint operators \( N^1_{n,t}, N^2_{n,t}, N^3_{n,t}\) (of a reference frame \(n\in {{\textsf{T}}}_+\) with a co-moving Minkowskian coordinate system \(t=x^0,x^1,x^2,x^3\)) describe a timelike worldline since

$$\begin{aligned} \sum _{k=1}^3 \left( \frac{d}{dt} \langle \psi | N^{k}_{n,t} \psi \rangle \right) ^2 < 1 \,. \end{aligned}$$
(31)

Proof

See “Appendix A”. \(\square \)
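The content of (31) can be illustrated numerically. The following is a minimal sketch, assuming (in line with (29)) that the time derivative of \(\langle \psi | N^{k}_{n,t} \psi \rangle \) is the average of \(p_k/E_n(p)\) with respect to a probability distribution on momentum space; the function name `mean_velocity`, the discrete sampling of the distribution and the numerical values are hypothetical. Since each velocity \(\vec {p}/E_n(p)\) lies in the open unit ball when \(m>0\), so does every convex combination of them.

```python
import numpy as np

def mean_velocity(p, weights, m):
    """Average of p / E(p), with E(p) = sqrt(m^2 + |p|^2), over a discrete distribution.

    p       : array of shape (n_samples, 3), spatial momenta
    weights : array of shape (n_samples,), non-negative and not all zero
    m       : mass, strictly positive
    """
    energy = np.sqrt(m**2 + np.sum(p**2, axis=1))
    w = weights / weights.sum()              # normalize to a probability distribution
    return (w[:, None] * p / energy[:, None]).sum(axis=0)

rng = np.random.default_rng(0)
# Highly relativistic momenta, boosted along the first axis (units with c = 1).
momenta = 50.0 * rng.normal(size=(1000, 3)) + np.array([200.0, 0.0, 0.0])
weights = rng.random(1000)

v = mean_velocity(momenta, weights, m=1.0)
print("mean velocity:", v)
print("squared speed:", float(v @ v))
```

Even for momenta much larger than m the squared speed stays below 1, which is the inequality (31) in this toy setting.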

I stress that this result, together with the covariance properties stated in Propositions 8 and 13, suggests that the Newton–Wigner position localization observable possesses important physically sound features which should be preserved in any improvement of this sort of formalization. On the other hand, some substantial improvement is also necessary because, as we shall see shortly, the Newton–Wigner localization also suffers from physically insurmountable issues related to causality.

4 Problems with spatial localization

This section is devoted to examining the consequences, for the Newton–Wigner position localization observable, of an important general result by Hegerfeldt [20, 21] that, in the end, rules it out. The analysis only concerns the issue (I1) presented in the introduction and extends to more general notions of spatial localization based on POVMs rather than PVMs.

I stress that I will stick to the basic version of Hegerfeldt’s result. A modern formulation, which improves Hegerfeldt’s original ideas, appears in [8, 9].

4.1 Castrigiano’s causality requirement

Suppose that a one-particle Klein–Gordon pure state represented by \(\psi \in {{{\mathcal {H}}}}\), with \(||\psi ||=1\), defines a family \(\mu ^\psi \) of probability measures \(\mu ^\psi _{n,t}: {\mathscr {L}}(\Sigma _{n,t}) \rightarrow [0,1]\), where \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\), such that \(\mu ^\psi _{n,t}(\Delta )\) represents the probability of detecting the particle in \(\Delta \subset \Sigma _{n,t}\). I will call this collection a family of spatial localization probability measures associated to the state \(\psi \). How this association is implemented will be discussed later.

A physically meaningful requirement on families of spatial localizations was explicitly introduced by Castrigiano (see Footnote 4) in [8] and analyzed in depth therein in the case of particles with spin (within the more elaborate notion of causal system). Castrigiano’s requirement was actually formulated in terms of POVMs, which I will introduce later. Here I adopt a definition in terms of families of probability measures which is equivalent to Castrigiano’s one as soon as one deals with POVMs.

The next definition states Castrigiano’s causality requirement as item (b). The notion of causal time evolution presented in (a) was also introduced by Castrigiano. I stress that the distinction between (a) and (b) is just functional to this study, though the validity of (a) is an evident consequence of (b) (see Footnote 5), which is the causality condition introduced in [8].

Definition 15

Let

$$\begin{aligned} \mu ^\psi := \{\mu ^\psi _{n,t}: {\mathscr {L}}(\Sigma _{n,t}) \rightarrow [0,1]\}_{n\in {{\textsf{T}}}_+, t\in {{\mathbb {R}}}} \end{aligned}$$

be the family of spatial localization probability measures of a pure state represented by \(\psi \in {{{\mathcal {H}}}}\) with \(||\psi ||=1\).

  1. (a)

    A given \(n\in {{\textsf{T}}}_+\) defines a causal time evolution if, for every \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\),

    $$\begin{aligned} \mu ^\psi _{n,t}(\Delta ) \le \mu ^\psi _{n,t'}(\Delta ') \quad \forall t'\in {{\mathbb {R}}}\,. \end{aligned}$$
    (32)

    where \(\Delta ':= \left( J^+(\Delta ) \cup J^-(\Delta ) \right) \cap \Sigma _{n,t'}\).

  2. (b)

    (Castrigiano’s causality requirement) The full family \(\mu ^\psi \) is causal if, for every \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\), it holds

    $$\begin{aligned} \mu ^\psi _{n,t}(\Delta ) \le \mu ^\psi _{n',t'}(\Delta ') \quad \forall n, n'\in {{\textsf{T}}}_+ \,, \forall t, t'\in {{\mathbb {R}}}\,, \end{aligned}$$
    (33)

    where \(\Delta ':= \left( J^+(\Delta ) \cup J^-(\Delta ) \right) \cap \Sigma _{n',t'}\).
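Condition (32) can be made concrete in a toy model. The following is only a sketch under strong simplifying assumptions: one spatial dimension, units with \(c=1\), a fixed reference frame n, probability measures given by densities sampled on a grid, and \(\Delta \) an interval \([a,b]\), for which \(\Delta '\) is the interval enlarged by \(|t'-t|\) on both sides. The names `enlarge`, `prob`, `satisfies_32` and `gaussian` are hypothetical, and the check concerns a single choice of \(\Delta \) and \(t'\), whereas Definition 15 quantifies over all of them.

```python
import numpy as np

def enlarge(a, b, t, t_prime):
    """Delta' = (J^+(Delta) u J^-(Delta)) n Sigma_{n,t'} for Delta = [a, b] at time t (1+1 dims, c = 1)."""
    s = abs(t_prime - t)
    return a - s, b + s

def prob(density, x, a, b):
    """Riemann-sum approximation of the probability of [a, b] for a density sampled on the uniform grid x."""
    dx = x[1] - x[0]
    return density[(x >= a) & (x <= b)].sum() * dx

def satisfies_32(rho_t, rho_tp, x, a, b, t, t_prime, tol=1e-12):
    """Check the inequality (32) for one interval Delta = [a, b] and one pair of times."""
    a2, b2 = enlarge(a, b, t, t_prime)
    return bool(prob(rho_t, x, a, b) <= prob(rho_tp, x, a2, b2) + tol)

def gaussian(x, center, sigma=0.1):
    """Normalized Gaussian density, used here as a rigidly translating toy probability density."""
    return np.exp(-((x - center) ** 2) / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))

x = np.linspace(-10.0, 10.0, 20001)
rho0 = gaussian(x, 0.0)

# Density translated with speed 0.5 (subluminal) and with speed 3 (superluminal) after unit time.
subluminal = satisfies_32(rho0, gaussian(x, 0.5), x, -0.1, 0.1, 0.0, 1.0)
superluminal = satisfies_32(rho0, gaussian(x, 3.0), x, -0.1, 0.1, 0.0, 1.0)
print("speed 0.5 satisfies (32):", subluminal)
print("speed 3.0 satisfies (32):", superluminal)
```

In this classical toy model a rigid translation faster than light moves the probability out of \(\Delta '\), and the inequality fails; the quantum situation discussed below is subtler because the measures arise from a single state \(\psi \).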

Remark 16

  1. (1)

    The reason why I passed from \({\mathscr {B}}(\Sigma _{n,t})\) to \({\mathscr {L}}(\Sigma _{n,t})\) is that, if \(\Delta \in {\mathscr {B}}(\Sigma _{n,t})\), then it may happen that \(\Delta ' \not \in {\mathscr {B}}(\Sigma _{n',t'})\). Vice versa, if \(\Delta \subset \Sigma _{n,t}\) (not necessarily Lebesgue measurable!), then \(\Delta ' \in {\mathscr {L}}(\Sigma _{n',t'})\) for every \(n'\ne n\) and \(t,t'\in {{\mathbb {R}}}\), as established in Lemma 16 of [8].

  2. (2)

    Evidently, the validity of (b) implies that (a) holds for every choice of \(n\in {{\textsf{T}}}_+\). However, if (a) is true for all \(n\in {{\textsf{T}}}_+\), (b) can be false in principle.

  3. (3)

    The definition of causal family of spatial localizations is symmetric under time reversal, i.e., it also considers \(J^-(\Delta )\). This is because, if the probability is interpreted as a density of particles, the particles which reached \(\Delta \) at time t must have passed through \(J^-(\Delta )\cap \Sigma _{n',t'}\) for every rest space \(\Sigma _{n',t'}\) in the past of \(\Delta \). There are intermediate situations where the intersection of \(\Sigma _{n',t'}\) and \(\Sigma _{n,t}\) includes \(\Delta \), but they can be treated separately by dividing the particles into two cases. \(\blacksquare \)

4.2 Justification of the causal condition in the special case of sharp localization

The condition (b) above seems physically reasonable. However, it is not obvious how to justify it within the framework of this work (and analogous ones), as everything should be justified within the framework of the issue (I1) disregarding (I2). In other words, I should not refer to any issue concerning post-measurement states, but I have to stick to a unique given family \(\mu ^\psi \). I can perform at most one position measurement because, after a measurement referred to the state \(\psi \) and the family \(\mu ^\psi \), the state changes (see Footnote 6), \(\psi \rightarrow \psi '\), and the family \(\mu ^\psi \) changes accordingly, \(\mu ^\psi \rightarrow \mu ^{\psi '}\), in a way I cannot control without a precise choice of the post-measurement state. Instead, Definition 15 considers a unique family \(\mu ^\psi \).

There is a case, however, where a justification of the requirements in the above definition is sufficiently easy even referring to a unique family \(\mu ^\psi \) (one measurement procedure only). Let us illustrate how the failure of condition (a) (and thus of (b)) for a choice of \(n\in {{\textsf{T}}}_+\) would permit superluminal transmission of information in the special case where there are states strictly localized at time \(t=0\) in some bounded region \(\Delta \), i.e., such that \(\mu ^\psi _{n,0}(\Delta )=1\) in the reference frame \(n\in {{\textsf{T}}}_+\). This justification does not need to tackle the issue of the post-measurement state.

Consider two types of Klein–Gordon particles with masses \(m_1 \ne m_2\), respectively, and collect, at \(t=0\), a large number of these particles (of the two types) in a box at rest in \(\Sigma _{n,0}\). We can imagine the box as the bounded region \(\Delta \subset \Sigma _{n,0}\). I assume that it is possible to open the box only for the mass \(m_1\) or mass \(m_2\) particles with some sort of filter. The procedure is then as follows:

  1. (1)

    I make a decision about which type of particles (\(m_1\) or \(m_2\)) to free from \(\Delta \) at time \(t=0\) and I free it;

  2. (2)

    somebody detects the particles in \(\Sigma _{n,t}\) at time \(t>0\) and observes the value of the mass.

If (32) failed, a particle could be detected in the region \(\Delta ' \subset \Sigma _{n,t}\) with \(\Delta ' \cap J^+(\Delta )= \varnothing \), and this procedure would manage to transmit the information about my mass choice made in the spatial region \(\Delta \) at time \(t=0\) outside the causal future of this event!

The crucial point in the above discussion is that some states are available whose probability measure at \(t=0\) is zero outside the bounded region \(\Delta \).

Very unfortunately, as I will discuss shortly, sharply localized position probabilities are ruled out by the Hegerfeldt theorem.

The above justification of the causality condition (b), which only relies on (I1), does not seem to be that easy to re-propose when referring to families \(\mu ^\psi \) which are not sharply localized (see Sect. 5.3). In this case, as discussed in the rest of this work, the position observable is described in terms of a POVM instead of a PVM. In principle, in the absence of sharply localized states, one may try to use again an analogous argument where, at time \(t=0\), two types of bosons stay in a box with a certain very large probability. Opening the box for only one kind of boson should be formalized in terms of suitable quantum operations [6], not necessarily trace preserving, which define the quantum states of the two types of particles at time \(t>0\). Here, precise theoretical choices seem to be necessary and the elementary setting of (I1) does not seem to be sufficient.

This matter deserves further attention, but in this paper I will be content with assuming Castrigiano’s causality requirement and the consequent notion of causal time evolution as natural ideas.

4.3 Spatial localization in terms of POVMs

As is known (see, e.g., [29]), if \(A:{{{\mathcal {H}}}} \rightarrow {{{\mathcal {H}}}}\), then \(A\ge 0\) means \(\langle \psi |A\psi \rangle \ge 0\) for all \(\psi \in {{{\mathcal {H}}}}\). This requirement for A is equivalent to \(A=A^\dagger \in {{\mathfrak {B}}}({{{\mathcal {H}}}})\) and \(\sigma (A) \subset [0,+\infty )\). Finally, if also \(B: {{{\mathcal {H}}}}\rightarrow {{{\mathcal {H}}}}\), then \(A\ge B\) means \(A-B \ge 0\).

An effect (see [6] for a modern up-to-date textbook on the subject) is a bounded operator \({{\textsf{E}}}\in {{\mathfrak {B}}}(\mathcal{H})\), for a Hilbert space \({{{\mathcal {H}}}}\), such that \(0\le {{\textsf{E}}}\le I\). \({{\mathfrak {E}}}({{{\mathcal {H}}}})\) will indicate henceforth the set of effects in \({{{\mathcal {H}}}}\). An orthogonal projector is an effect but there are effects which are not orthogonal projectors.

A (normalized) Positive Operator Valued Measure (POVM) is a map

$$\begin{aligned} \Sigma (X) \ni \Delta \mapsto {{\textsf{E}}}(\Delta ) \in {{\mathfrak {E}}}({{{\mathcal {H}}}})\,, \end{aligned}$$

where \(\Sigma (X)\) is a \(\sigma \)-algebra on X, such that the function is (see Def. 4.5 in [6] and the remarks under that definition)

  1. (a)

    normalized: \({{\textsf{E}}}(X) =I\);

  2. (b)

    \(\sigma \)-additive: \(\sum _{n \in {{\mathbb {N}}}} {{\textsf{E}}}(\Delta _n) = {{\textsf{E}}}(\cup _{n\in {{\mathbb {N}}}}\Delta _n)\) when \(\Delta _n\cap \Delta _m = \varnothing \) for \(n\ne m\) and the sum is understood in the weak (or equivalently strong) operator topology.

Notice that (a) and (b) imply, in particular, that \({{\textsf{E}}}(\varnothing )=0\). Furthermore (b) can be equivalently replaced by the requirement that \(\Sigma (X) \ni \Delta \mapsto \langle \psi | {{\textsf{E}}}(\Delta ) \psi ' \rangle \) is a complex measure (with finite total variation) for every \(\psi ,\psi ' \in {{{\mathcal {H}}}}\).

Remark 17

  1. (1)

    It is clear that a PVM is a specific case of POVM where the positive operators \({{\textsf{E}}}(\Delta )\) are orthogonal projectors.

  2. (2)

    In general, a POVM does not satisfy \([{{\textsf{E}}}(\Delta ),{{\textsf{E}}}(\Delta ')]=0\) for \(\Delta \cap \Delta ' = \varnothing \), contrary to what happens for a PVM.

  3. (3)

    The one-to-one link between self-adjoint operators and PVMs does not hold in the case of POVMs. Something remains, however, since under some technical hypotheses a POVM is uniquely determined by a symmetric operator, in terms of the first moment of the POVM, as I will briefly discuss later. The failure of the hypotheses for that property will play some role in this paper. \(\blacksquare \)

The general notion of observable, in the modern approaches to Quantum Theory, is a (normalized) POVM on a \(\sigma \)-algebra \(\Sigma (X)\) and taking values in \({{\mathfrak {B}}}({{{\mathcal {H}}}})\), where \({{{\mathcal {H}}}}\) is the Hilbert space of the considered quantum system:

  1. (1)

    The elements \(\Delta \in \Sigma (X)\) are the outcomes of measurements and,

  2. (2)

    if \(\rho \) is a generally mixed state (a trace-class, unit-trace positive operator in \({{\mathfrak {B}}}({{{\mathcal {H}}}})\)), then \(\Sigma (X) \ni \Delta \mapsto tr(\rho {{\textsf{E}}}(\Delta ))\) is the probability measure associated to these outcomes. It boils down to \(\Sigma (X) \ni \Delta \mapsto \langle \psi |{{\textsf{E}}}(\Delta ) \psi \rangle \) in case of a pure state represented by the unit vector \(\psi \in {{{\mathcal {H}}}}\).
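The notions of effect, POVM and outcome probability recalled above can be illustrated in the simplest possible setting. The following is a minimal sketch, assuming a two-dimensional Hilbert space and a three-point outcome space, which have nothing to do with the Klein–Gordon particle; the helper names `is_effect`, `is_povm` and `outcome_probabilities`, as well as the specific effects and state, are hypothetical. It exhibits a normalized POVM whose elements are not projectors and do not commute, as in (2) of Remark 17, and computes \(tr(\rho {{\textsf{E}}}(\Delta ))\).

```python
import numpy as np

def is_effect(E, tol=1e-12):
    """True if E is self-adjoint with spectrum in [0, 1], i.e. 0 <= E <= I."""
    if not np.allclose(E, E.conj().T):
        return False
    eigenvalues = np.linalg.eigvalsh(E)
    return bool(eigenvalues.min() >= -tol and eigenvalues.max() <= 1.0 + tol)

def is_povm(effects):
    """True if all elements are effects and they sum to the identity (normalization)."""
    dim = effects[0].shape[0]
    return all(is_effect(E) for E in effects) and bool(np.allclose(sum(effects), np.eye(dim)))

def outcome_probabilities(rho, effects):
    """Probabilities tr(rho E) of the elementary outcomes for the state rho."""
    return np.array([np.trace(rho @ E).real for E in effects])

ket0 = np.array([1.0, 0.0], dtype=complex)
ket_plus = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)

E1 = 0.5 * np.outer(ket0, ket0.conj())          # half of a rank-one projector
E2 = 0.5 * np.outer(ket_plus, ket_plus.conj())  # half of another, non-orthogonal one
E3 = np.eye(2) - E1 - E2                        # fixed by normalization
effects = [E1, E2, E3]

rho = np.diag([0.75, 0.25]).astype(complex)     # a mixed state
probs = outcome_probabilities(rho, effects)

print("is a POVM:", is_povm(effects))
print("E1 is a projector:", np.allclose(E1 @ E1, E1))
print("E1 and E2 commute:", np.allclose(E1 @ E2, E2 @ E1))
print("probabilities:", probs)
```

For a finite outcome space, \(\sigma \)-additivity reduces to finite additivity, so a POVM is completely specified by the effects attached to the single outcomes.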

Definition 18

A relativistic spatial localization observable for a Klein–Gordon particle of mass \(m>0\) described in the (complex, separable) Hilbert space \({{{\mathcal {H}}}}\) is defined as a family of normalized POVMs \({{\textsf{E}}}_{n,t}: {\mathscr {L}}(\Sigma _{n,t}) \rightarrow {{\mathfrak {E}}}({{{\mathcal {H}}}})\), where \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\), that is covariant with respect to the strongly continuous unitary representation U of \(IO(1,3)_+\) (7):

$$\begin{aligned} U_{h} {{\textsf{E}}}_{n,t}(\Delta ) U_{h}^{-1} = {{\textsf{E}}}_{\Lambda _h n,t_h}(h\Delta ) \,, \quad \forall \Delta \in {\mathscr {L}}(\Sigma _{n,t})\,, \quad h \in IO(1,3)_+\,. \end{aligned}$$
(34)

A very detailed technical analysis of the notion above (called Poincaré covariant POL therein) appears in Sects. 6 and 7 of [8] referring to a general system and establishing some extension and uniqueness properties from POVMs covariant under the Euclidean group to POVMs covariant under the full \(IO(1,3)_+\) group.

The use of POVMs defined on \({\mathscr {L}}(\Sigma _{n,t})\) is mandatory due to Remark 16.

By the same elementary procedure used to complete positive measures, a POVM \({{\textsf{E}}}\) defined on \({\mathscr {B}}(\Sigma _{n,t})\) uniquely extends to a completion: another POVM \({\overline{{{\textsf{E}}}}}\), on a larger \(\sigma \)-algebra \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{E}}}}\) made of the unions of the elements of \({\mathscr {B}}(\Sigma _{n,t})\) with the subsets of the zero-\({{\textsf{E}}}\)-measure sets,

$$\begin{aligned} {\overline{{{\textsf{E}}}}}(\Delta \cup Z):= {{\textsf{E}}}(\Delta )\,, \quad \Delta \in {\mathscr {B}}(\Sigma _{n,t})\,, \quad Z \subset B \in {\mathscr {B}}(\Sigma _{n,t})\,, \quad {{\textsf{E}}}(B)=0\,. \end{aligned}$$

Exactly as in standard measure theory, \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{E}}}}\) is characterized by the fact that it is the smallest \(\sigma \)-algebra including \({\mathscr {B}}(\Sigma _{n,t})\) and equipped with an extension \({\overline{{{\textsf{E}}}}}\) of \({{\textsf{E}}}\) such that all subsets of zero-\({\overline{{{\textsf{E}}}}}\)-measure sets in \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{E}}}}\) belong to \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{E}}}}\).

Trivially, the outlined procedure extends a POVM which is a PVM to a completion that is a PVM as well. In particular, the completion of the previously discussed Newton–Wigner PVM turns out to be defined on \(\overline{{\mathscr {B}}(\Sigma _{n,t})}^{{{\textsf{Q}}}_{n,t}}= {\mathscr {L}}(\Sigma _{n,t})\) as a consequence of (1) in Proposition 12 and elementary properties of the Lebesgue measure: \({\mathscr {L}}(\Sigma _{n,t})\ni \Delta \mapsto \overline{{{\textsf{Q}}}_{n,t}}(\Delta ) \in {{\mathfrak {E}}}({{{\mathcal {H}}}})\). As one immediately proves, this completion still satisfies the \(IO(1,3)_+\) covariance and all the properties established in the previous section. In the rest of the paper, I will simply write \({{\textsf{Q}}}_{n,t}(\Delta )\) in place of \(\overline{{{\textsf{Q}}}_{n,t}}(\Delta )\) when \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\).

4.4 Troubles with Newton–Wigner and sharply localized states: the Hegerfeldt theorem

Hegerfeldt [21] proved the following quite devastating theorem against the Newton–Wigner notion of localization, in particular. I reformulate the result established in [21] into the language of Definition 15 and explicitly for a massive Klein–Gordon real spinless particle.

Theorem 19

(Hegerfeldt) Consider a spatial localization POVM of a massive Klein–Gordon particle according to Def. 18. Suppose that there are \(\psi \in {{{\mathcal {H}}}}\) with \(||\psi ||=1\) and \(e\in \Sigma _{n_e,t_e}\) such that the probability to find the particle outside the balls \(B_r(e) \subset \Sigma _{n_e,t_e}\) with common center e and variable radii \(r>0\) satisfies

$$\begin{aligned} \langle \psi | {{\textsf{E}}}_{n_e,t_e}(\Sigma _{n_e,t_e} \setminus B_r(e)) \psi \rangle \le K_1 e^{-K_2 r} \quad \text{ for } \text{ some } K_1>0, K_2 \ge 2m \text{ and } \text{ all } r>0\,. \end{aligned}$$

Then \(n_e\) cannot define a causal time evolution of the family of probability measures \(\mu ^{\psi }_{n,t}(\Delta ):= \langle \psi | {{\textsf{E}}}_{n,t}(\Delta ) \psi \rangle \) according to condition (a) in Def. 15.

A crucial corollary follows against the Newton–Wigner notion of spatial localization.

Corollary 20

The (completion of the) Newton–Wigner spatial localization observable does not satisfy Castrigiano’s causality condition, because (a) in Def. 15 fails for every choice of \(n\in {{\textsf{T}}}_+\).

Proof

Arbitrarily fix \(e\in \Sigma _{n_e,t_e}\), choose \(R>0\) and consider the orthogonal projector \({{\textsf{Q}}}_{n_e,t_e}(B_R(e))\). It holds \({{\textsf{Q}}}_{n_e,t_e}(B_R(e)) \ne 0\) due to (1) in Proposition 12, since a non-empty open set has strictly positive measure \(\textrm{d}\Sigma _{n_e,t_e}\). Therefore, there exists \(\psi = {{\textsf{Q}}}_{n_e,t_e}(B_R(e))\psi \) with \(||\psi ||=1\). Evidently \(\langle \psi | {{\textsf{Q}}}_{n_e,t_e}(\Sigma _{n_e,t_e} {\setminus } B_r(e)) \psi \rangle =0\) if \(r>R\) since

$$\begin{aligned} {{\textsf{Q}}}_{n_e,t_e}(\Sigma _{n_e,t_e} \setminus B_r(e)) \psi = {{\textsf{Q}}}_{n_e,t_e}(\Sigma _{n_e,t_e} \setminus B_r(e)) {{\textsf{Q}}}_{n_e,t_e}(B_R(e)) \psi = {{\textsf{Q}}}_{n_e,t_e}(\varnothing )\psi =0\,. \end{aligned}$$

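Since the probability on the left-hand side of the bound never exceeds 1 and vanishes for \(r>R\), \(\psi \) satisfies the hypotheses of Hegerfeldt’s theorem with respect to the family of balls \(B_r(e)\), for instance with \(K_2=2m\) and \(K_1=e^{2mR}\). Arbitrariness of \(n_e \in {{\textsf{T}}}_+\) concludes the proof. \(\square \)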

An interesting paper by Ruijsenaars [34] presents some explicit numerical estimates of the probabilities of recording a violation of causality through measurements of the Newton–Wigner observable for a scalar Klein–Gordon massive particle.

It is evident that, on account of the corollary, Physics rules out the Newton–Wigner notion of localization because it does not satisfy a basic requirement about causality, in particular, taking Sect. 4.2 into account. However, this is very disappointing because the Newton–Wigner position operator shows some natural and quite appealing features, as previously illustrated in Proposition 13 and its Corollary 14. This inconclusive asymmetry is very annoying and is certainly a reason why the Newton–Wigner notion of localization is still a subject of discussion in the literature. In the rest of the paper we will see how it is possible to keep the good things (the position operator) and get rid of the bad ones (the PVM).

Remark 21

  1. (1)

    There are other, even more severe, problems with the Newton–Wigner notion of spatial localization and causality when one analyzes it on the ground of the issue (I2) of the introduction, by assuming the Lüders’ projection postulate about the post-measurement state.

  2. (2)

    The Newton–Wigner notion of spatial localization is acausal not only with respect to time translations but equally regarding boosts. This so-called frame dependence of Newton–Wigner localization has been observed already in [37] and it is still studied in the literature, e.g., [15]. It is obvious that any notion of spatial localization in terms of POVMs should meet the requirement of frame independence.

  3. (3)

    It is interesting to notice that the example of the rejection of the Newton–Wigner observable shows how the idea that every PVM/self-adjoint operator in the Hilbert space of a quantum system must be an observable is definitely untenable. However, to the author’s knowledge, this is the first time that the rejection of a self-adjoint operator as an observable in quantum mechanics is due to local causality and not to the existence of a gauge group or a superselection rule.

  4. (4)

    The above version of the Hegerfeldt theorem is the classic one: it explicitly refers to the Klein–Gordon particle and can be immediately extended to particles with spin. Actually it is not necessary that full covariance with respect to our representation of \(IO(1,3)_+\) holds. There are more abstract versions of this theorem that refer to abstract POVMs and rely only on (a) positivity of the self-adjoint generator of temporal translations and (b) covariance with respect to four translations. See, in particular, Theorem B1 in [1] (see also footnote 7). A thorough analysis of the interplay of spatial localization and Hamiltonian positivity appears in Sects. 4 and 5 of [8]. \(\blacksquare \)

5 The spatial localization observable proposed by Terno

In [8], Castrigiano proved that for spin 1/2 it is possible to define a spatial localization observable different from the Newton–Wigner one which satisfies the causality requirement (b) of Def. 15. That observable is a PVM if the positivity assumption on the Hamiltonian evolutor is not imposed and becomes a POVM when restricting to the subspace of positive energy. Unfortunately, that construction does not work for scalar Klein–Gordon particles as discussed in Sect. 23 of [8].

5.1 Terno’s POVM: the heuristic definition from QFT

Terno [36] introduced a position localization POVM starting from elementary notions of free QFT in Minkowski spacetime. Though that notion was also extended to photons in [36], here I stick to the case of a real scalar massive Klein–Gordon field.

I review the definition of that POVM in the formal language of theoretical physics of QFT first. Later I will translate it into a more mathematically rigorous setting. I start from the stress energy operator of QFT. Let

$$\begin{aligned}:{\hat{T}}_{\mu \nu }:(x):=\,:\partial _\mu {\hat{\phi }} \partial _\nu {\hat{\phi }}:(x) - \frac{1}{2} g_{\mu \nu } \left( :\partial _\alpha {\hat{\phi }}\partial ^\alpha {\hat{\phi }}:(x) + m^2:{\hat{\phi }}^2:(x)\right) \end{aligned}$$

be the coordinate representation of the normally ordered stress-energy tensor operator in the symmetric Fock space \({{\mathfrak {F}}}_+({{{\mathcal {H}}}})\) of the real Klein–Gordon field operator \({\hat{\phi }}\) with mass \(m>0\). Referring to a Minkowski coordinate system co-moving with \(n\in {{\textsf{T}}}_+\), if \(\Delta \subset \Sigma _{n,t}\), define

$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Delta ):= \frac{1}{\sqrt{H_n}}P_1 \int _\Delta :{\hat{T}}_{\mu \nu }:(x) n^\mu n^\nu \, \textrm{d}\Sigma _{n,t}(x) P_1\frac{1}{\sqrt{H_n}} \,, \quad \text{ with } -n\cdot x = t, \end{aligned}$$
(35)

where \(P_1: {{\mathfrak {F}}}_+({{{\mathcal {H}}}}) \rightarrow {{{\mathcal {H}}}}\) is the orthogonal projector onto the one-particle space of the symmetric Fock space \({{\mathfrak {F}}}_+({{{\mathcal {H}}}})\) constructed upon the Minkowski vacuum state with \({{{\mathcal {H}}}}\) as the one-particle subspace. Actually, the definition in [36] uses the total Hamiltonian in the Fock space and \(P_1\) is swapped with the inverse square root of the said Hamiltonian, but that definition is formally equivalent to that above.

Formally speaking, without paying attention to domains, as \(:{\hat{T}}_{\mu \nu }: (x) n^\mu n^\nu \) turns out to be positive, the integral is a positive operator so that \(0\le {{\textsf{A}}}_{n,t}(\Delta ) \le {{\textsf{A}}}_{n,t}(\Delta ')\) if \(\Delta \subset \Delta '\). The integral on the whole rest space amounts to

$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Sigma _{n,t})= & {} H_n^{-1/2} P_1 \left[ 0 \oplus H_n \oplus (H_n\otimes I \oplus I \otimes H_n) \oplus \cdots \right] \\{} & {} P_1 H^{-1/2}_{n} = H_n^{-1/2} H_n H^{-1/2}_{n}=I\,. \end{aligned}$$

Hence, \(0\le {{\textsf{A}}}_{n,t}(\Delta )\le I\). \(\sigma \)-additivity with respect to \(\Delta \) is guaranteed by the very presence of the integration over \(\Delta \). As a matter of fact, barring mathematical details I will fix shortly, that is a (non-commutative) POVM.

A straightforward formal manipulation of the right-hand side of (35) also yields a natural \(IO(1,3)_+\)-covariance relation

$$\begin{aligned} U_h {{\textsf{A}}}_{n,t}(\Delta ) U^\dagger _h = {{\textsf{A}}}_{\Lambda _h n, t_h}(h\Delta ) \quad \text{ if } h\in IO(1,3)_+ \,. \end{aligned}$$

The physical idea behind Terno’s definition should be clear: probabilistically speaking, the particle stays where the energy is. This idea was previously formulated in [2], where, however, no explicit POVM was constructed. The crucial normalization factors \(H_n^{-1/2}\) were explicitly introduced in [36].

5.2 Terno’s spatial localization observable

Expanding the quantum field in modes as usual, a straightforward computation starting from (35) yields, for \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) and \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\),

$$\begin{aligned} ({{\textsf{A}}}_{n,t}(\Delta )\psi )(q) :={}&\int _\Delta \textrm{d}\Sigma _{n,t}(x) \int _{{\textsf{V}}_{m,+}} \textrm{d}\mu _m(p)\, \frac{e^{-i(q-p)\cdot x}}{(2\pi )^3}\, \frac{E_n(p)E_n(q)+ \frac{1}{2}(p\cdot q+ m^2)}{\sqrt{E_n(p)E_n(q)}}\, \psi (p) \\ &\text{ with } -n\cdot x = t, \end{aligned}$$
(36)

which I will assume to be the definition of the family of operators \({{\textsf{A}}}_{n,t}(\Delta )\), for \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\), on the domain \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\).

Theorem 22

Referring to a massive real Klein–Gordon particle, the family of operators \({{\textsf{A}}}_{n,t}(\Delta ): {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \rightarrow {{{\mathcal {H}}}}\) defined in (36) for \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) uniquely continuously extends to a POVM, which we shall denote by the same symbol, for every given pair \((n,t)\). The following further facts are valid.

  1. (1)

    The family is covariant with respect to the strongly continuous unitary representation U of \(IO(1,3)_+\) (7):

    $$\begin{aligned} U_{h} {{\textsf{A}}}_{n,t}(\Delta ) U_{h}^{-1} = {{\textsf{A}}}_{\Lambda _h n,t_h}(h\Delta ) \,, \quad \forall \Delta \in {\mathscr {L}}(\Sigma _{n,t})\,, \quad \forall h \in IO(1,3)_+\,. \end{aligned}$$
    (37)

    and thus it defines a relativistic spatial localization observable.

  2. (2)

    Referring to the (Lebesgue-completion of the) Newton–Wigner spatial localization observable \({{\textsf{Q}}}_{n,t}\), the following identity is true

    $$\begin{aligned} {{\textsf{A}}}_{n,t}(\Delta ) = {{\textsf{Q}}}_{n,t}(\Delta ) + \frac{1}{2}\left( \eta ^{\mu \nu }\frac{P_{n\mu }}{H_n} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{P_{n\nu }}{H_n} + \frac{m}{H_n} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{m}{H_n} \right) \end{aligned}$$
    (38)

    for every \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\). (The various everywhere-defined bounded composite operators \(P_n^\mu /H_{n}\) and \(m/H_n\) are defined in terms of the joint spectral measure of \(P^\mu \) and with standard spectral calculus.)

Proof

Let us prove (1) and (2). Fix \(n\in {{\textsf{T}}}_+\) and \(t\in {{\mathbb {R}}}\). If \(\psi ',\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and we denote by \(B\psi \) the right-hand side of (36) and by C the right-hand side of (38), a straightforward computation that takes (21) into account proves that \(\langle \psi '| B\psi \rangle = \langle \psi '|C \psi \rangle \). Since \(\psi '\) varies in a dense set, the found identity implies that \(B\psi = C\psi \) for all \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). As C is continuous and everywhere defined on \({{{\mathcal {H}}}}\), we conclude that the operator defined in (36) uniquely extends by continuity to the operator in (38). On the other hand, since the operators \({{\textsf{Q}}}_{n,t}(\Delta )\) define a PVM, the structure of the right-hand side of (38), which can be re-arranged to

$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Delta ) = \frac{1}{2}\left( {{\textsf{Q}}}_{n,t}(\Delta ) +\sum _{k=1}^3\frac{P_{nk}}{H_n} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{P_{nk}}{H_n} + \frac{m}{H_n} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{m}{H_n} \right) \end{aligned}$$
(39)

defines a family of positive operators of \({{\mathfrak {B}}}({{{\mathcal {H}}}})\). Notice, in particular, that \( \frac{P_{n\nu }}{H_n}= \left( \frac{P_{n\nu }}{H_n}\right) ^\dagger \in {{\mathfrak {B}}}({{{\mathcal {H}}}})\) and \( \frac{m}{H_n}= \left( \frac{m}{H_n}\right) ^\dagger \in {{\mathfrak {B}}}(\mathcal{H})\). The family of operators in the right-hand side of (39) is also evidently weakly \(\sigma \)-additive in \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\). The constructed POVM is normalized because \({{\textsf{Q}}}_{n,t}\) is and \(\eta ^{\mu \nu }P_{n\mu }P_{n\nu } + m^2 I=0\):

$$\begin{aligned} {{\textsf{A}}}_{n,t}(\Sigma _{n,t}) = {{\textsf{Q}}}_{n,t}(\Sigma _{n,t}) + \frac{1}{2}\left( \eta ^{\mu \nu }\frac{P_{n\mu }}{H_n}I \frac{P_{n\nu }}{H_n} + \frac{m}{H_n} I \frac{m}{H_n} \right) =I+ 0 =I\,. \end{aligned}$$

The proof of (37) is strictly analogous to the one of (18) or it can be established immediately from it by taking (38) into account and the obvious covariance properties of the operators \(P_{n\mu }\). \(\square \)
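The algebra that identifies (36) with (38) and (39) can be spot-checked numerically. The Python sketch below is only an illustration and not part of the original argument: all function names are hypothetical, the signature \((-,+,+,+)\) is assumed, and it is assumed that the Newton–Wigner kernel carries the factor \(\sqrt{E_n(p)E_n(q)}\) with respect to \(\textrm{d}\mu _m\). It compares the momentum-dependent factor of the kernel in (36) with the one read off from (39) for one pair of on-shell momenta.

```python
import math
import random


def on_shell(vec, m):
    """Four-momentum (E, px, py, pz) on the mass shell for spatial momentum vec."""
    return (math.sqrt(m * m + sum(c * c for c in vec)),) + tuple(vec)


def minkowski_dot(p, q):
    """p.q with the assumed signature (-,+,+,+)."""
    return -p[0] * q[0] + sum(a * b for a, b in zip(p[1:], q[1:]))


def kernel_36(p, q, m):
    """Momentum-dependent factor appearing in (36)."""
    ep, eq = p[0], q[0]
    return (ep * eq + 0.5 * (minkowski_dot(p, q) + m * m)) / math.sqrt(ep * eq)


def kernel_39(p, q, m):
    """Same factor read off from (39): (1/2)(Q + sum_k V_k Q V_k + W Q W).

    Assumption: the Newton-Wigner kernel contributes sqrt(E(p)E(q)).
    """
    ep, eq = p[0], q[0]
    spatial = sum(a * b for a, b in zip(p[1:], q[1:]))
    return 0.5 * math.sqrt(ep * eq) * (1.0 + spatial / (ep * eq) + m * m / (ep * eq))


random.seed(0)
mass = 1.3
p = on_shell([random.uniform(-5, 5) for _ in range(3)], mass)
q = on_shell([random.uniform(-5, 5) for _ in range(3)], mass)
print(kernel_36(p, q, mass), kernel_39(p, q, mass))
```

The two printed numbers should agree up to rounding, reflecting the identity \(E_n(p)E_n(q)+\frac{1}{2}(p\cdot q+m^2)=\frac{1}{2}(E_n(p)E_n(q)+\vec {p}\cdot \vec {q}+m^2)\).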

Definition 23

Referring to Theorem 22, we call each \({{\textsf{A}}}_{n,t}\) Terno’s spatial localization POVM in the reference frame \(n\in {{\textsf{T}}}_+\) at time \(t\in {{\mathbb {R}}}\). The family \({{\textsf{A}}}\) of POVMs \({{\textsf{A}}}_{n,t}\) will be named Terno’s spatial localization observable.
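To get a feeling for the structure (39), one can build a finite-dimensional analogue. The following Python sketch is a toy model under loudly stated assumptions, not the construction of the paper: space is a one-dimensional periodic lattice, the Newton–Wigner projector is replaced by multiplication with an indicator function, and \(P/H\), \(m/H\) are discrete Fourier multipliers. The function name `toy_terno_effect` and all parameters are hypothetical.

```python
import numpy as np


def toy_terno_effect(mask, m=1.0, length=20.0):
    """Discrete 1D analogue of (39): A = (Q + V Q V + W Q W) / 2.

    mask   : boolean array selecting the lattice sites of the region
    m      : mass parameter
    length : size of the periodic box (toy assumption)
    """
    n = mask.size
    p = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)   # lattice momenta
    energy = np.sqrt(p**2 + m**2)
    dft = np.fft.fft(np.eye(n), norm="ortho")            # unitary DFT matrix
    q = np.diag(mask.astype(float))                      # toy Newton-Wigner projector
    v = dft.conj().T @ np.diag(p / energy) @ dft         # toy P/H
    w = dft.conj().T @ np.diag(m / energy) @ dft         # toy m/H
    return 0.5 * (q + v @ q @ v + w @ q @ w)


mask = np.zeros(64, dtype=bool)
mask[20:30] = True
a_in = toy_terno_effect(mask)
a_out = toy_terno_effect(~mask)

# Normalization and additivity: the two effects sum to the identity.
print(np.allclose(a_in + a_out, np.eye(64)))
# Positivity and boundedness: the spectrum lies in [0, 1] up to rounding.
eig = np.linalg.eigvalsh(a_in)
print(eig.min(), eig.max())
# The effect is not a projector, so the toy family is a POVM but not a PVM.
print(np.allclose(a_in @ a_in, a_in))
```

In this toy setting the normalization works for the same reason as in the proof of Theorem 22: the multipliers satisfy \((p/E)^2+(m/E)^2=1\).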

Remark 24

Contrary to the case of the Newton–Wigner localization, covariance with respect to the spatial Euclidean subgroup is not sufficient to fix the structure of \( {{\textsf{A}}}_{n,t}\), since there are infinitely many POVMs with that covariance property with respect to a unitary strongly continuous representation of the Euclidean group [7]. \(\blacksquare \)

5.3 Almost localized states

The following proposition illustrates a fundamental difference between the notion of spatial localization by Newton–Wigner and the one by Terno: localized states in bounded regions are permitted by the former but are impossible for the latter. This implies, in particular, that the argument of Corollary 20 (which ruled out the Newton–Wigner localization notion) cannot be directly applied to \({{\textsf{A}}}_{n,t}\). In [36], it is proved (exploiting an argument of [2]) that the spatial decay of the probability distribution arising from the POVM \({{\textsf{A}}}_{n,t}\) does not reach the bound sufficient to trigger Hegerfeldt’s local-causality catastrophe. I will achieve that result indirectly, by establishing that the time evolution with respect to every \(n\in {{\textsf{T}}}_+\) is causal for the said POVM.

However, this is not the whole story. Indeed, the second statement of the next proposition shows that, for every (in particular, bounded) region \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) with non-empty interior, there are states which are arbitrarily good approximations of states sharply localized in that region.

Proposition 25

Referring to the Terno spatial localization observable \({{\textsf{A}}}\), the following facts are true.

  1. (1)

    Suppose that \(\psi \in {{{\mathcal {H}}}}\) with \(||\psi ||=1\), \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) satisfy

    $$\begin{aligned} \langle \psi |{{\textsf{A}}}_{n,t}(\Delta ) \psi \rangle =1\,. \end{aligned}$$

    In that case \(\Delta \) is dense in \(\Sigma _{n,t}\). In particular, \(\Delta \) cannot be bounded.

  2. (2)

    For every given \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\) and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\) with \(Int(\Delta )\ne \varnothing \), there is a sequence of vectors \(\{\psi _j\}_{j \in {{\mathbb {N}}}}\subset {{{\mathcal {H}}}}\) such that \(||\psi _j||=1\) and

    $$\begin{aligned} \langle \psi _j| {{\textsf{A}}}_{n,t}(\Delta )\psi _j\rangle \rightarrow 1\,, \quad \text{ as } j\rightarrow +\infty . \end{aligned}$$
  3. (3)

    For every given \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\) and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\), if \(Int(\Delta )\ne \varnothing \), then \(||{{\textsf{A}}}_{n,t}(\Delta )||=1\).

Proof

(1) Define \(\Delta ':=\Sigma _{n,t}\setminus \Delta \). By additivity, \(\langle \psi |{{\textsf{A}}}_{n,t}(\Delta ') \psi \rangle =0\). From (39) and the fact that \({{\textsf{Q}}}_{n,t}\) is a PVM, \(\langle \psi |{{\textsf{A}}}_{n,t}(\Delta ') \psi \rangle =0\) can be rephrased to

$$\begin{aligned} \frac{1}{2}||{{\textsf{Q}}}_{n,t}(\Delta ')\psi ||^2 + \frac{1}{2} \sum _{k=1}^3|| {{\textsf{Q}}}_{n,t}(\Delta ')H_n^{-1}P_{nk}\psi ||^2 + \frac{m^2}{2} ||{{\textsf{Q}}}_{n,t}(\Delta ')H_n^{-1}\psi ||^2 =0 \,. \end{aligned}$$

In particular, \({{\textsf{Q}}}_{n,t}(\Delta ')\psi =0\) and \({{\textsf{Q}}}_{n,t}(\Delta ')H_n^{-1}\psi =0\). Using the representation (23) of the Hilbert space vectors, these requirements can be restated as \(1_{\Delta '}(\vec {x}) \Psi _t(\vec {x})=0\) and \(1_{\Delta '}(\vec {x}) (\overline{-\Delta + m^2I})^{-1/2}\Psi _t(\vec {x})=0\), where \(-\Delta \) inside the round brackets denotes the Laplacian. Hence \(\Psi _t(\vec {x})=0\) and \( (\overline{-\Delta + m^2I})^{-1/2}\Psi _t(\vec {x})=0\) a.e. on \(\Delta '\). If \(\Delta '\) included a non-empty open set, Theorem 11 would imply that \(\Psi _t=0\), which is not permitted by hypothesis. Therefore \(\Delta '\) has empty interior, which means that \(\Delta \) is dense in \(\Sigma _{n,t}\).

(2) It is evidently sufficient to prove it for the special case \(\Delta = B_R\subset \Sigma _{n,t}\) given by an open ball of finite radius \(R>0\). Indeed, if \(\Delta \) admits non-empty interior, then \(\Delta \supset B_R\) for some such ball and thus \(0\le \langle \psi | {{\textsf{A}}}_{n,t}(B_R) \psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t}(\Delta ) \psi \rangle \le 1\) if \(||\psi ||=1\). A sequence of localizing states \(\psi _j\) for \(B_R\) is also a sequence of localizing states for \(\Delta \). Finally, we can always assume \(t=0\) without loss of generality, as the reader can immediately prove using a trivial time translation and exploiting the covariance properties of \({{\textsf{A}}}\). So we prove the thesis for the ball \(B_R\). Consider a \(C^\infty \) function \(\chi \ge 0\) on \(\Sigma _{n,0}\) with \(supp(\chi ) \subset B_R\). Let us identify \(\Sigma _{n,0}\) with \({{\mathbb {R}}}^3\) with a co-moving Minkowski coordinate system of n whose spatial origin is the center of \(B_R\). If \(\vec {a} \in {{\mathbb {R}}}^3\) is a fixed non-vanishing vector and \(j\in {{\mathbb {N}}}\), define

$$\begin{aligned} {\hat{\chi }}_j(\vec {k}):= \frac{1}{(2\pi )^{3/2}}\int _{{{\mathbb {R}}}^3} e^{-i \vec {k}\cdot \vec {x} } e^{i j \vec {a} \cdot \vec {x} } \chi (\vec {x})\, d^3x \in {\mathscr {S}}({{\mathbb {R}}}^3)\,. \end{aligned}$$

Notice that the \(L^2\) norm of these vectors does not depend on j and is \(||\chi ||_{L^2({{\mathbb {R}}}^3, d^3x)}\). We can always choose \(\chi \) in order that \(||{\hat{\chi }}_j||_{L^2({{\mathbb {R}}}^3, d^3k)}=1 = ||\chi ||_{L^2({{\mathbb {R}}}^3, d^3x)}\) for all \(j\in {{\mathbb {N}}}\). Finally, define the family of the unit vectors \(\psi _j \in {{{\mathcal {H}}}}\),

$$\begin{aligned} \psi _j(k):=\sqrt{E_n(\vec {k})} {\hat{\chi }}_j(\vec {k}) \,,\quad j\in {{\mathbb {N}}}\,. \end{aligned}$$

From (17),

$$\begin{aligned} \langle \psi _j | {{\textsf{Q}}}_{n,0}(B_R) \psi _j \rangle = \int _{B_R} \overline{\chi (\vec {x})} \chi ({\vec x})d^3x = ||\chi ||^2_{L^2({{\mathbb {R}}}^3, d^3x)}=1\,. \end{aligned}$$

Decomposing \(\langle \psi _j | {{\textsf{A}}}_{n,0}(B_R) \psi _j \rangle \) as in (38), we have that \(\langle \psi _j | {{\textsf{A}}}_{n,0}(B_R) \psi _j \rangle - \langle \psi _j | {{\textsf{Q}}}_{n,0} (B_R) \psi _j \rangle \rightarrow 0\) because

$$\begin{aligned} \left\langle \psi _j \left| \left( \eta ^{\mu \nu }\frac{P_{n\mu }}{H_n} {{\textsf{Q}}}_{n,0}(B_R) \frac{P_{n\nu }}{H_n} + \frac{m}{H_n} {{\textsf{Q}}}_{n,0}(B_R) \frac{m}{H_n} \right) \right. \psi _j \right\rangle \rightarrow 0 \quad \text{ if } j\rightarrow +\infty \,. \end{aligned}$$
(40)

The proof of the limit above is postponed to “Appendix A”. This concludes the proof of (2), because \(\langle \psi _j | {{\textsf{Q}}}_{n,0}(B_R) \psi _j \rangle =1\) as said above.

(3) is an easy consequence of (2), \(0\le {{\textsf{A}}}_{n,t}(\Delta )= {{\textsf{A}}}_{n,t}(\Delta )^\dagger \le I\) and \(||{{\textsf{A}}}_{n,t}(\Delta )|| = \sup \{|\langle \psi | {{\textsf{A}}}_{n,t}(\Delta ) \psi \rangle |\,|\, ||\psi ||=1\}\). \(\square \)
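The mechanism behind item (2) can be visualized in a one-dimensional lattice toy model. The Python sketch below rests on assumptions that are not in the paper: space is a periodic 1D lattice, states are written directly in the Newton–Wigner (position) representation, and the effect is the discrete analogue \(\frac{1}{2}(Q+VQV+WQW)\) of (39) with Fourier multipliers \(V=p/E\), \(W=m/E\). The helper `terno_probability` and all numerical parameters are hypothetical. It evaluates the toy probability of the region for a smooth bump multiplied by phases \(e^{ik_0x}\) of increasing \(k_0\), mimicking the role of \(e^{ij\vec {a}\cdot \vec {x}}\).

```python
import numpy as np


def terno_probability(chi, mask, p, m):
    """<chi | A(mask) chi> for the toy effect A = (Q + V Q V + W Q W) / 2."""
    energy = np.sqrt(p**2 + m**2)
    chi_hat = np.fft.fft(chi, norm="ortho")
    total = 0.0
    for mult in (np.ones_like(p), p / energy, m / energy):
        # Apply the multiplier in momentum space, then restrict to the region.
        phi = np.fft.ifft(mult * chi_hat, norm="ortho")
        total += np.sum(np.abs(phi[mask]) ** 2)
    return 0.5 * total


n, length, m, radius = 512, 32.0, 1.0, 1.0
x = (np.arange(n) - n // 2) * (length / n)
p = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
mask = np.abs(x) < radius

# Smooth bump supported inside the region, normalized on the lattice.
chi = np.zeros(n)
chi[mask] = np.exp(-1.0 / (1.0 - (x[mask] / radius) ** 2))
chi /= np.linalg.norm(chi)

probabilities = []
for j in (0, 25, 100):
    k0 = 2.0 * np.pi * j / length            # momentum shift on the lattice grid
    chi_j = np.exp(1j * k0 * x) * chi
    probabilities.append(terno_probability(chi_j, mask, p, m))
print(probabilities)
```

For large momentum shifts the multiplier \(p/E\) is close to 1 and \(m/E\) is close to 0 on the momentum support of the state, so the toy probability approaches 1 without ever exceeding it; this is the discrete counterpart of the limit (40).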

5.4 Interplay of the first-moment operator of \({{\textsf{A}}}\) and the NW position operator

I now introduce the first moment of Terno’s POVM, a symmetric operator. I will prove, in particular, that its closure coincides with the Newton–Wigner position operator, so that it preserves all the good properties of the Newton–Wigner position operator.

Theorem 26

Take \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), choose a co-moving Minkowski coordinate system \(x^0=t,x^1,x^2,x^3\). There is only one operator \(X^\mu _{n,t}: {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \rightarrow {{{\mathcal {H}}}}\), for every \(\mu = 0,1,2,3\), completely defined as the first moment of the POVM \({{\textsf{A}}}_{n,t}\):

$$\begin{aligned} \langle \psi | X^\mu _{n,t} \psi \rangle := \int _{\Sigma _{n,t}} x^\mu d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle \,, \quad \forall \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\quad \text{ and } \text{ where } -n\cdot x=t\,. \end{aligned}$$
(41)

The following facts are true.

  1. (1)

    \( X^\mu _{n,t}\) satisfies

    $$\begin{aligned} \langle \psi | X^\mu _{n,t} \psi \rangle = \langle \psi | N^\mu _{n,t} \psi \rangle \quad \forall \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,, \end{aligned}$$
    (42)

    where \( N^\mu _{n,t}\) is the Newton–Wigner position operator, so that the further following facts are valid.

    1. (a)

      The identity holds

      $$\begin{aligned} X^\mu _{n,t}= N^\mu _{n,t}|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})}\,. \end{aligned}$$
      (43)
    2. (b)

      \(X^\mu _{n,t}\) is symmetric, essentially self-adjoint and its unique self-adjoint extension is \( N^\mu _{n,t}\) itself.

    3. (c)

      The Heisenberg commutation relations hold, where \(k,h=1,2,3\):

      $$\begin{aligned} {[} X_{n,t}^k, X_{n,t}^h]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} =[ P_{n h}, P_{n k}]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} =0\,, \qquad [ X_{n,t}^k, P_{n h}]|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})} = i\delta ^k_hI|_{{{{\mathcal {S}}}}({{{\mathcal {H}}}})}\,. \end{aligned}$$
      (44)
    4. (d)

      The \(IO(1,3)_+\) covariance relations are true, if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and \(IO(1,3)_+ \ni h= (\Lambda _h, a_h)\),

      $$\begin{aligned} U_h X_{n,t}^\alpha U_h^{-1} \psi = (\Lambda ^{-1}_h)^\alpha _\beta (X^{\beta }_{\Lambda _h n, t_{h}} - a_h^\beta I)\psi , \quad \forall h \in IO(1,3)_+\,. \end{aligned}$$
      (45)
    5. (d)

      The Heisenberg time evolution relation is valid (see footnote 8):

      $$\begin{aligned} U^{(n)\dagger }_t X_{n,0}^k U^{(n)}_t\psi = X_{n,t}^k\psi = X_{n,0}^k \psi + t\frac{P_{nk }}{P_{n0}}\psi \quad \text{ for } \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \text{ and } k=1,2,3\,. \end{aligned}$$
      (46)
    6. (e)

      If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and \(||\psi ||=1\), the first-moment operators define a timelike worldline because

      $$\begin{aligned} \sum _{k=1}^3 \left( \frac{d}{dt} \langle \psi | X^{k}_{n,t} \psi \rangle \right) ^2 < 1 \,. \end{aligned}$$
      (47)
  2. (2)

    If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) with \(||\psi ||=1\) and \(k=1,2,3\),

    $$\begin{aligned} \int _{\Sigma _{n,t}} (x^k)^2 d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle = \langle \psi | ( N^k_{n,t})^2\psi \rangle + \left\langle \psi \left| \frac{(P_{n0})^2-(P_{nk})^2}{2(P_{n0})^4}\psi \right. \right\rangle \,. \end{aligned}$$
    (48)

    As a consequence, a corrected version of the Heisenberg inequality holds for \(k=1,2,3\) (restoring the physical constants):

    $$\begin{aligned} \Delta _\psi X^k_{n,t} \Delta _\psi P_{nk} \ge \frac{\hbar }{2} \sqrt{1 + 2(\Delta _\psi P_{nk})^2 \left\langle \psi \left| \frac{(P_{n0})^2-(P_{nk})^2}{(P_{n0})^4}\psi \right. \right\rangle }\,, \quad \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,, \end{aligned}$$
    (49)

    where \(\Delta _\psi X^k_{n,t}\) is the standard deviation of the probability measure \({\mathscr {L}}(\Sigma _{n,t}) \ni \Delta \mapsto \langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle \in [0,1]\).

Proof

It is clear that, if an operator \(X_{n,t}^\mu \) exists that satisfies (41), then it must be unique on its domain \(\mathcal{S}({{{\mathcal {H}}}})\). That is because, by polarization, any other operator \(S: {{{\mathcal {S}}}}({{{\mathcal {H}}}}) \rightarrow {{{\mathcal {H}}}}\) that satisfies that identity would have the same matrix elements \(\langle \psi '| S\psi \rangle = \langle \psi '| X_{n,t}^\mu \psi \rangle \) when \(\psi ,\psi ' \in \mathcal{S}({{{\mathcal {H}}}})\). Since this space is dense, we have \( S\psi = X_{n,t}^\mu \psi \). To conclude the proof of the initial statement in (1), it is therefore sufficient to show that (42) is valid. Properties (a)–(e) are then obvious consequences of the analogs for \( N_{n,t}^\mu \) and of the fact that \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is also invariant under U, \( N_{n,t}^\beta \), and \(P_{n\alpha }\). The proof of (42), taking (38) into account, just amounts to proving that

$$\begin{aligned} \eta ^{\mu \nu } \int _{\Sigma _{n,t}} x^k d\left\langle \frac{P_{n\mu }}{H_n}\psi \left| {{\textsf{Q}}}_{n,t}(x) \frac{P_{n\nu }}{H_n} \right. \psi \right\rangle + \int _{\Sigma _{n,t}} x^k d \left\langle \frac{m}{H_n}\psi \left| {{\textsf{Q}}}_{n,t}(x) \frac{m}{H_n}\right. \psi \right\rangle =0\,, \end{aligned}$$

if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and \(k=0,1,2,3\). The case \(k=0\) is trivial, since in that situation \(x^0=t\) can be extracted from the two integrals and the identity boils down to the trivial one \(t\langle \psi | H_n^{-2}(P_{n}^\mu P_{n\mu } + m^2 I) \psi \rangle =0\). Regarding the cases \(k=1,2,3\), taking advantage of the spectral decomposition of \( N^k_{n,t}\), the identity above can be rewritten

$$\begin{aligned} \eta ^{\mu \nu } \left\langle \frac{P_{n\mu }}{H_n}\psi \left| N^k_{n,t} \frac{P_{n\nu }}{H_n} \right. \psi \right\rangle + \left\langle \frac{m}{H_n}\psi \left| N^{k}_{n,t} \frac{m}{H_n}\right. \psi \right\rangle =0\,, \end{aligned}$$

where we have also used the fact that \({{{\mathcal {S}}}}({{{\mathcal {H}}}}) \subset D( N^k_{n,t})\) and the former space is invariant under the self-adjoint bounded operators \(H_n^{-1}\) and \(H_n^{-1}P_{n\mu }\) as the reader immediately proves. The identity above can be rearranged to the equivalent form (remember that \(H_n= -P_{n0}\))

$$\begin{aligned}{} & {} \eta ^{\mu \nu } \left\langle \psi \left| \frac{P_{n\mu }}{H_n} \frac{P_{n\nu }}{H_n} N^k_{n,t} \right. \psi \right\rangle + \left\langle \psi \left| \frac{m}{H_n} \frac{m}{H_n} N^{k}_{n,t}\right. \psi \right\rangle \\{} & {} \quad + \eta ^{\mu \nu } \left\langle \psi \left| \frac{P_{n\mu }}{H_n} \left[ N^k_{n,t}, \frac{P_{n\nu }}{H_n}\right] \right. \psi \right\rangle + \left\langle \psi \left| \frac{m}{H_n} \left[ N^{k}_{n,t}, \frac{m}{H_n}\right] \right. \psi \right\rangle =0\,. \end{aligned}$$

Representing the identity above in the Hilbert space \(L^2({{\mathbb {R}}}^3, d^3p)\) where \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is represented by \({\mathscr {S}}({{\mathbb {R}}}^3)\) itself, \(P_{n\mu }= p_\mu \cdot \), \(H_n= E_n(p) \cdot \) are multiplicative and, for \(\psi \in {\mathscr {S}}({{\mathbb {R}}}^3)\) we have \( N^{k}_{n,t}\psi = i\frac{\partial }{\partial p_k}\psi \), we see that the two commutators are multiplicative operators as well. Therefore, for instance \( \frac{P_{n\mu }}{H_n} \left[ N^k_{n,t}, \frac{P_{n\nu }}{H_n}\right] = \frac{1}{2} \frac{P_{n\mu }}{H_n} \left[ N^k_{n,t}, \frac{P_{n\nu }}{H_n}\right] + \frac{1}{2} \left[ N^k_{n,t}, \frac{P_{n\nu }}{H_n}\right] \frac{P_{n\mu }}{H_n} = \frac{1}{2} \left[ N^k_{n,t}, \frac{P^2_{n\nu }}{H^2_n}\right] \) and similarly for the other addends. In summary, the identity we need to establish can be rearranged to

$$\begin{aligned} \left\langle \psi \left| \frac{ \eta ^{\mu \nu } P_{n\mu } P_{n\nu } +m^2I}{H^2_n} N^k_{n,t} \right. \psi \right\rangle + \frac{1}{2} \left\langle \psi \left| \left[ N^k_{n,t}, \frac{ \eta ^{\mu \nu } P_{n\mu } P_{n\nu } +m^2I}{H^2_n}\right] \right. \psi \right\rangle =0\,, \end{aligned}$$

which is evidently true, because \( \eta ^{\mu \nu } P_{n\mu } P_{n\nu } +m^2I=0\) on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\). This completes the proof of (1).

Let us pass to (2) and prove (48). With the same procedure used to prove (1) and if \(\psi \in {{{\mathcal {S}}}}(\mathcal{H})\), we find through (38)

$$\begin{aligned} \int _{\Sigma _{n,t}} (x^k)^2 d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle ={}& \langle \psi | ( N^k_{n,t})^2 \psi \rangle \\ &+\frac{1}{2} \eta ^{\mu \nu } \left\langle \psi \left| \frac{P_{n\mu }}{H_n} ( N^k_{n,t})^2 \frac{P_{n\nu }}{H_n} \right. \psi \right\rangle + \frac{1}{2}\left\langle \psi \left| \frac{m}{H_n} ( N^{k}_{n,t})^2 \frac{m}{H_n}\right. \psi \right\rangle \,. \end{aligned}$$

The second line can be re-arranged to

$$\begin{aligned}{} & {} \frac{1}{2}\eta ^{\mu \nu } \left\langle \psi \left| \frac{P_{n\mu }}{H_n}\frac{P_{n\nu }}{H_n} ( N^k_{n,t})^2 \right. \psi \right\rangle +\frac{1}{2} \left\langle \psi \left| \frac{m}{H_n}\frac{m}{H_n} ( N^{k}_{n,t})^2 \right. \psi \right\rangle \\{} & {} \quad +\frac{1}{2} \eta ^{\mu \nu } \left\langle \psi \left| \frac{P_{n\mu }}{H_n} \left[ ( N^k_{n,t})^2, \frac{P_{n\nu }}{H_n}\right] \right. \psi \right\rangle +\frac{1}{2} \left\langle \psi \left| \frac{m}{H_n} \left[ ( N^{k}_{n,t})^2, \frac{m}{H_n}\right] \right. \psi \right\rangle \,. \end{aligned}$$

The first line vanishes, while the second can be explicitly computed by working in the space \(L^2({{\mathbb {R}}}^3, d^3p)\) exactly as we did for item (1) and it becomes

$$\begin{aligned}{} & {} -\eta ^{\mu \nu }\frac{1}{2} \left\langle \psi \left| \frac{p_{\mu }}{p_0} \left[ \left( \frac{\partial }{\partial p_k}\right) ^2, \frac{p_{\nu }}{p_0}\right] \right. \psi \right\rangle -\frac{1}{2} \left\langle \psi \left| \frac{m}{p_0} \left[ \left( \frac{\partial }{\partial p_k}\right) ^2, \frac{m}{p_0}\right] \right. \psi \right\rangle \\{} & {} \quad =\left\langle \psi \left| \frac{1}{2p^0} \left( \partial _{p_k}\frac{p^k}{p^0} \right) \right. \psi \right\rangle = \left\langle \psi \left| \frac{H_n^2-P_k^2}{2H_n^4}\psi \right. \right\rangle \,, \end{aligned}$$

where \(p_0= -\sqrt{m^2 + \sum _{k=1}^3 p_k^2}\) and the operators \(p_\mu \) are multiplicative. The proof of (48) is over. To prove (49), observe that

$$\begin{aligned} (\Delta _\psi X^k_{n,t})^2{} & {} = \int _{\Sigma _{n,t}} (x^k)^2 d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle - \left( \int _{\Sigma _{n,t}} x^k d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle \right) ^2\\{} & {} = \langle \psi | ( N^k_{n,t})^2\psi \rangle + \left\langle \psi \left| \frac{H_n^2-P_k^2}{2H_n^4}\psi \right. \right\rangle -\langle \psi | N^k_{n,t}\psi \rangle ^2 \\{} & {} = (\Delta _\psi N^k_{n,t})^2 + \left\langle \psi \left| \frac{H_n^2-P_k^2}{2H_n^4}\psi \right. \right\rangle \,. \end{aligned}$$

By multiplying both sides with \((\Delta _\psi P_k)^2\) and taking advantage of the standard Heisenberg inequality, we get (49). \(\square \)
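As a cross-check of the last step in the proof of (48), the multiplicative operator appearing there can be computed by hand. The shorthand \(E:= \sqrt{m^2 + \sum _{j=1}^3 p_j^2}\) (so that \(p^0=E\) and \(p^k=p_k\) in the coordinates used above) is introduced here only for illustration, and the index k is fixed (no sum over k):

$$\begin{aligned} \frac{1}{2E}\frac{\partial }{\partial p_k}\left( \frac{p_k}{E}\right) = \frac{1}{2E}\left( \frac{1}{E} - \frac{p_k^2}{E^3}\right) = \frac{E^2-p_k^2}{2E^4} \ge \frac{m^2}{2E^4} > 0\,. \end{aligned}$$

Assuming that (49) arises exactly as described in the last line of the proof, the final step can be sketched as follows: multiplying the identity for \((\Delta _\psi X^k_{n,t})^2\) by \((\Delta _\psi P_k)^2\) and using \((\Delta _\psi N^k_{n,t})^2 (\Delta _\psi P_k)^2 \ge \hbar ^2/4\) gives

$$\begin{aligned} (\Delta _\psi X^k_{n,t})^2 (\Delta _\psi P_k)^2 \ge \frac{\hbar ^2}{4} + (\Delta _\psi P_k)^2 \left\langle \psi \left| \frac{H_n^2-P_k^2}{2H_n^4}\psi \right. \right\rangle \,, \end{aligned}$$

where the additional term is non-negative by the previous computation.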

Remark 27

  1. (1)

    The first-moment operator can be formally written within the QFT setting of Sect. 5.1,

    $$\begin{aligned} X^k_{n,0} = \frac{1}{\sqrt{H_n}} P_1 \int _{\Sigma _{n,0}} x^k:{\hat{T}}_{\mu \nu }:(x) n^\mu n^\nu \, d \Sigma _{n,0}(x) P_1 \frac{1}{\sqrt{H_n}}\,. \end{aligned}$$

    The internal integral is nothing but the k-component of the boost generator in QFT evaluated at \(t=0\). The position operator obtained in that way coincides with the known Born-Infeld position operator as discussed in [3] and remarked in [36].

  2. (2)

    Item (2) is of mathematical interest. If the identity were

    $$\begin{aligned} \int _{\Sigma _{nt}} (x^k)^2 d\langle \psi | {{\textsf{A}}}_{n,t}(x) \psi \rangle = \langle \psi | (X^k_{n,t})^2\psi \rangle \,, \end{aligned}$$

    since \(X^k_{n,t}\) is symmetric and (41) is true, one could apply a known theorem by Naimark about the decomposition of symmetric operators in terms of POVMs (see Theorem 23 in [11] and the discussion about it). On account of that theorem, the POVM that decomposes \(X^k_{n,t}\) according to (41) would be uniquely determined by its first moment \(X^k_{n,t}\), provided this operator is maximally symmetric on its domain, which is our case since \(X^k_{n,t}\) is essentially self-adjoint. Following this argument one would conclude that \({{\textsf{A}}}_{nt}= {{\textsf{Q}}}_{nt}\), since the latter POVM (actually a PVM) decomposes \(\overline{X^k_{n,t}} = N^k_{n,t}\) (as in (41) on \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\)) in view of the spectral theorem. In summary, the cumbersome addend on the right-hand side of (48) is responsible for the failure of \({{\textsf{A}}}_{nt}= {{\textsf{Q}}}_{nt}\).

  3. (3)

    Given a pure state represented by a unit vector \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), also the standard Heisenberg inequalities

    $$\begin{aligned} \Delta _\psi N^k_{n,t} \Delta _\psi P_{nk} \ge \hbar /2\,, \end{aligned}$$

    are valid for \( N_{n,t}^k\) and \(P_{nk}\) in addition to (49), as a consequence of the canonical commutation relations (44). The point is that these relations refer to the physically wrong probability distribution, the one constructed out of the Newton–Wigner PVM \({{\textsf{Q}}}_{n,t}\) instead of the Terno POVM \({{\textsf{A}}}_{n,t}\). \(\blacksquare \)

6 Every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution for \({{\textsf{A}}}\)

This section is devoted to proving that every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution in Castrigiano’s sense, according to (a) in Definition 15, for every family \(\mu ^\psi \) constructed out of the POVMs \({{\textsf{A}}}\) and a pure state \(\psi \in {{{\mathcal {H}}}}\): \(\mu ^{\psi }_{n,t}(\Delta ):= \langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle \).

Remark 28

There are other notions of spatial localization which are causal with respect to time evolution. The localizations in terms of POVMs due to Petzold et al. [17, 18] and Henning and Wolf [23] are causal with respect to time evolution. The proof in [18] can be made rigorous by means of the mathematical approach developed in this section. \(\blacksquare \)

6.1 The heuristic idea of a conserved probability four-current

The technology I will exploit to prove that \({{\textsf{A}}}_{n,t}\) produces a family of probability measures that satisfies the requirement (a) in Definition 15 for every \(n\in {{\textsf{T}}}_+\) is based on a probability four-current associated to \(\langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle \). As explicitly observed in [36], (I disregard here a number of mathematical details which will be fixed later)

$$\begin{aligned} \int _{\Delta } d\langle \psi |{{\textsf{A}}}_{n,t}(x)\psi \rangle = \int _{\Delta } J^{\psi }_{n\mu }(x) n^\mu \textrm{d}\Sigma _{n,t}(x)\,, \end{aligned}$$

where \(J^{\psi }_{n}\) satisfies a conservation equation \(\partial ^\mu J^{\psi }_{n \mu } =0\). The existence of such a probability four-current was postulated in the general case in [25]; see also [17, 18, 23, 26] for the use of similar currents in relation to the causality problem for massive Klein–Gordon particles. A similar current exists for Dirac and Weyl particles [8, 9]. Assuming that \(J^{\psi }_{n}\) is causal, the divergence theorem should imply the validity of the local-causality requirement when restricting to the family of t-parametrized rest spaces of a single reference frame. I will prove that this is the case in full generality, referring to every Lebesgue set \(\Delta \). The extension to the full family of reference frames, i.e., the proof of the validity of (b) in Definition 15, is not so easy since \(J^{\psi }_{n}(x)\) itself depends on n and one has to compare \(\int _{\Delta } J^{\psi }_{n\mu }(x) n^\mu \textrm{d}\Sigma _{n,t}(x)\) and \(\int _{\Delta } J^{\psi }_{n'\mu }(x) {n'}^\mu \textrm{d}\Sigma _{n',t'}(x)\).

6.2 The probability current of the stress-energy tensor

The first step of the proof consists of explicitly writing down the current \(J^\psi _n\) [36] for the special case \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). As usual, I represent events by means of four-vectors \({{\mathbb {M}}}\ni e= o+ x(e)\) where \(x(e) \in {\textsf{V}}\).

Directly from (36), one has that, if \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\), \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\)

$$\begin{aligned} \langle \psi |{{\textsf{A}}}_{n,t}(\Delta )\psi \rangle = \int _{\Delta } T^{\psi }_{\mu \nu }(x)_n n^\mu n^\nu d \Sigma _{n,t}(x)\,, \end{aligned}$$
(50)

where I introduced the coordinate representation of the stress-energy tensor of \(\Phi ^\psi _n\),

$$\begin{aligned} T^{\psi }_{\mu \nu }(x)_n{} & {} := \frac{1}{2}\left( \partial _\mu \overline{\Phi ^\psi _n(x)}\partial _\nu \Phi ^\psi _n(x) +\partial _\mu \Phi ^\psi _n(x)\partial _\nu \overline{\Phi ^\psi _n(x)}\right) \nonumber \\{} & {} \quad - \frac{1}{2}\eta _{\mu \nu } \left( \partial ^\alpha \overline{\Phi ^\psi _n(x)} \partial _\alpha \Phi ^\psi _n(x) + m^2 \overline{\Phi ^\psi _n(x)} \Phi ^\psi _n(x) \right) \,, \end{aligned}$$
(51)

associated to the smooth complex Klein–Gordon field

$$\begin{aligned} \Phi ^\psi _n(x):= \int _{{\textsf{V}}_{m,+}} \frac{\psi (p)e^{i p\cdot x} }{(2\pi )^{3/2}\sqrt{E_n(p)}} \textrm{d}\mu _m(p)\,, \end{aligned}$$
(52)

Notice the further factor \(E^{-1/2}_n(p)\) when comparing with (12) which arises from the analogous factors in the right-hand side of (35). Let us fix a Minkowskian coordinate system \(t=x^0,x^1,x^2,x^3\) comoving with some \(n\in {{\textsf{T}}}_+\). Since the factor of \(e^{i p\cdot x} \) in the integrand stays in \({\mathscr {S}}({{\mathbb {R}}}^3)\), the function \({{\mathbb {R}}}^3 \ni \vec {x} \mapsto \Phi ^\psi _n(t,\vec {x})\) belongs to \({\mathscr {S}}({{\mathbb {R}}}^3)\) as well for every \(t\in {{\mathbb {R}}}\).

Definition 29

If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), \(||\psi ||=1\) and \(n\in {{\textsf{T}}}_+\), the associated probability four-current of \({{\textsf{A}}}\) is the contravariant vector field \(J^\psi _n\) on \({{\mathbb {M}}}\) which, written in coordinates, reads

$$\begin{aligned} J^{\psi \mu }_{n}(x):= n^\nu T^{\psi \mu }_{\nu }(x)_n \,, \end{aligned}$$
(53)

where \((T^\psi _{\nu \mu })_n\) is defined in (51).

It is evident that, if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), \(n\in {{\textsf{T}}}_+\), \(t\in {{\mathbb {R}}}\), and \(\Delta \in {\mathscr {L}}(\Sigma _{n,t})\), (50) yields

$$\begin{aligned} \langle \psi |{{\textsf{A}}}_{n,t}(\Delta ) \psi \rangle = \int _{\Delta } J^\psi _{n\mu }(x) n^\mu \textrm{d}\Sigma _{n,t}(x) \,. \end{aligned}$$
(54)

Proposition 30

If \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), \(n\in {{\textsf{T}}}_+\), then \(J^{\psi }_{n}\) is either the zero vector or is causal and past-directed. More precisely:

  1. (1)

    there is an open dense set \({{\textsf{O}}}^\psi _{n}\subset {{\mathbb {M}}}\) where \(J^\psi _{n}\) is timelike and past-directed;

  2. (2)

    if \(e\in {{\mathbb {M}}}\setminus {{\textsf{O}}}^\psi _n\), then either \(J^\psi _{n}(e)=0\) or \(J^\psi _{n}(e)\) is lightlike and past-directed;

  3. (3)

    it holds \({{\textsf{O}}}^\psi _n =\{e \in {{\mathbb {M}}}\,|\; \Phi ^\psi _n(e) \ne 0\}\).

Proof

We need some preparatory identities and inequalities. Consider a Minkowskian coordinate system co-moving with n, so that \(n^\mu = \delta ^\mu _0\). If \(\Phi ^\psi _n = A_1+iA_2\) with \(A_j\) real, one can write

$$\begin{aligned} J^\psi _{n\mu } = J^\psi _{1n\mu }+J^\psi _{2n\mu } \end{aligned}$$

where, for \(j=1,2\),

$$\begin{aligned} J^\psi _{jn0}= & {} \frac{1}{2}\left( \partial _0A_j\partial _0 A_j + \sum _{k=1}^3\partial _kA_j\partial _k A_j + m^2 A_j^2\right) \,, \\ J^\psi _{jnh}= & {} \partial _0 A_j\partial _h A_j \,, \quad h=1,2,3. \end{aligned}$$

At this juncture observe that, for \(j=1,2\),

$$\begin{aligned} -g(J^\psi _{jn}, J^\psi _{jn}){} & {} = \frac{1}{4}\left( (\partial _0A_j)^2 + \sum _{k=1}^3(\partial _kA_j)^2 + m^2 A_j^2\right) ^2 - \sum _{k=1}^3(\partial _k A_j \partial _0 A_j)^2\nonumber \\{} & {} \quad = \frac{1}{4}\left( (\partial _0A_j)^2 - \sum _{k=1}^3(\partial _kA_j)^2 \right) ^2 + \frac{1}{4} m^4A_j^4\nonumber \\{} & {} \qquad + \frac{1}{2} m^2 (\partial _0 A_j)^2 A_j^2 + \frac{1}{2} m^2A_j^2 \sum _{k=1}^3(\partial _kA_j)^2 \ge 0\,. \end{aligned}$$
(55)
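The algebra behind the second equality in (55) can be sketched as follows. The shorthands \(a:= (\partial _0A_j)^2\), \(b:= \sum _{k=1}^3(\partial _kA_j)^2\) and \(c:= m^2A_j^2\) are introduced here only for illustration. Since \(\sum _{k=1}^3(\partial _k A_j \partial _0 A_j)^2 = ab\),

$$\begin{aligned} \frac{1}{4}(a+b+c)^2 - ab = \frac{1}{4}(a-b)^2 + \frac{1}{4}c^2 + \frac{1}{2}ac + \frac{1}{2}bc\,. \end{aligned}$$

Every addend on the right-hand side is non-negative because \(a,b,c\ge 0\). In particular, if \(m>0\) and \(A_j \ne 0\) at some event, then \(c^2>0\) there and the inequality in (55) is strict at that event.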

Let us pass to prove (1). Define \({{\textsf{O}}}^{\psi }_n\) as the set of events where \(J^\psi _{n\mu }\) is timelike. Let us prove that the set \({{\textsf{O}}}^{\psi }_n\) is dense and open and the vectors in it are past-directed.

(Dense.) It is clear from inequality (55) that, in particular, if \(\Phi ^\psi _n(e) \ne 0\) then \(J^\psi _{n\mu } = J^\psi _{1n\mu }+J^\psi _{2n\mu }\) is timelike so that \(e\in {{\textsf{O}}}^{\psi }_n\). If \(x\in {{\mathbb {M}}}\) and \(N \ni x\) is an open neighborhood of it, suppose that there is no \(e\in N\) where \(\Phi ^\psi _n(e)\ne 0\). In particular, \(\Phi ^\psi _n(e)= 0\) in the open spatial set \(\Sigma _{n,t(x)}\cap N\). As a consequence, the spatial derivatives of \(\Phi ^\psi _n\) also vanish on \(\Sigma _{n,t(x)}\cap N\) and (55) produces \(-g(J^\psi _{jn}, J^\psi _{jn})= \frac{1}{4}(\partial _t A_j(e) )^4\). If the right-hand side vanished for all \(e\in \Sigma _{n,t(x)}\cap N\) and \(j=1,2\), we would have that \(\Phi ^\psi _n(t,\cdot )\) and \((\overline{-\Delta +m^2})^{1/2}\Phi ^\psi _n(t,\cdot ) = -i \partial _t \Phi ^\psi _n(t,\cdot )=0\) on that open set in \(\Sigma _{n,t(x)}\). On account of Theorem 11, we would have \(\Phi ^\psi _n(t,\cdot )=0\) and thus \(\psi =0\) by inverting (52), and this is not allowed by hypothesis. We conclude that either \(\Phi ^\psi _n(t,e) \ne 0\) for some \(e\in \Sigma _{n,t(x)}\cap N\) or \(\Phi ^\psi _n(t,e) = 0\) for all \(e\in \Sigma _{n,t(x)}\cap N\), but \(\partial _t\Phi ^\psi _n(t,e) \ne 0\) for some \(e\in \Sigma _{n,t(x)}\cap N\). In both cases, (55) implies that \(J^\psi _{n}\) is timelike somewhere in the neighborhood N of x. We have proved that the set \({{\textsf{O}}}^\psi _n\) where \(J^\psi _n\) is timelike is dense.

(Open.) \({{\textsf{O}}}^\psi _n\) is also the preimage of an open set (the set of timelike vectors) under a continuous map and thus it is open as well.

(Past directed.) Since n is future-directed and \(J^\psi _{jn} \cdot n = J^\psi _{jn0} \ge 0\), we also have that \(J^\psi _n\) is past-directed when it does not vanish.

(2) Consider \(e\in {{\mathbb {M}}}\setminus {{\textsf{O}}}^\psi _n\), namely \(J^\psi _n(e)\) is not timelike. Since \(J^\psi _{n} = J^\psi _{1n}+J^\psi _{2n}\) we have

$$\begin{aligned} g(J^\psi _{n},J^\psi _{n})= g(J^\psi _{1n},J^\psi _{1n}) + g(J^\psi _{2n},J^\psi _{2n}) + 2g(J^\psi _{1n},J^\psi _{2n})\,. \end{aligned}$$

Notice that all scalar products taking place on the right-hand side above are non-positive: the first two because of (55) and the last one because the two vectors are limits of past-directed timelike vectors by (1). Since the left-hand side is zero by hypothesis, we have the following two possibilities: \(J^\psi _n(e)\) vanishes (if both \(J^\psi _{1n}\) and \(J^\psi _{2n}\) vanish) or it is lightlike (if one of the two vanishes and the other is lightlike or if both are lightlike and parallel). In all these cases both \(A_1\) and \(A_2\) vanish on account of (55) where \(m>0\), so that \(\Phi ^\psi _n(e)=0\) as well. To conclude, observe that if \(J^\psi _n\) is lightlike, then it must be past-directed by continuity because \({{\textsf{O}}}^\psi _n\) is dense and the vectors in that set are past-directed. The proof of (3) has been given while establishing (1) and (2). \(\square \)
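The non-positivity of the mixed product used in the proof of (2) can also be checked directly, without a limiting argument. The following is only a sketch, with the generic symbols \(V_1, V_2\) introduced for illustration: in coordinates comoving with n, let \(V_i = (V_i^0, \vec {V}_i)\), \(i=1,2\), be two vectors each of which is either zero or causal and past-directed, so that \(V_i^0 \le -|\vec {V}_i|\). Then \(V_1^0V_2^0 \ge |\vec {V}_1||\vec {V}_2|\) and

$$\begin{aligned} g(V_1,V_2) = -V_1^0V_2^0 + \vec {V}_1\cdot \vec {V}_2 \le -|\vec {V}_1||\vec {V}_2| + \vec {V}_1\cdot \vec {V}_2 \le 0\,. \end{aligned}$$

For non-vanishing \(V_1, V_2\), equality in both steps requires \(|V_i^0| = |\vec {V}_i|\) (both vectors lightlike) and \(\vec {V}_1\cdot \vec {V}_2 = |\vec {V}_1||\vec {V}_2|\) (spatial parts parallel), which is the case "both lightlike and parallel" listed in the proof.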

6.3 Every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution for \({{\textsf{A}}}\)

First of all, observe that if \(D\subset \Sigma _{n,t_1}\) is an open ball, then \(J^\pm (D)\) are open as well, as follows by direct inspection. This immediately implies that \(J^\pm (\Delta _1)\) are open if \(\Delta _1 \subset \Sigma _{n,t_1}\) is open and non-empty. As a consequence, when \(\Delta _1 \subset \Sigma _{n,t_1}\) is open, the intersections \(J^\pm (\Delta _1) \cap \Sigma _{n',t'}\) are open as well in the relative topology. I will use this fact several times in the rest of the paper.

Lemma 31

Consider the spatial localization observable \({{\textsf{A}}}\). Take \(n\in {{\textsf{T}}}_+\) and \(t_1,t_2 \in {{\mathbb {R}}}\) with \(t_2 \ne t_1\). Let \(\Delta _1 \subset \Sigma _{n,t_1}\) be a finite union of non-empty open balls with finite radius, and let \(\Delta _2:= (J^+(\Delta _1) \cup J^-(\Delta _1))\cap \Sigma _{n,t_2}\) be the corresponding open set in \(\Sigma _{n,t_2}\). Then

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle \end{aligned}$$
(56)

is valid for every \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) with \(||\psi ||=1\).
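To fix ideas, the simplest instance of the statement can be written out explicitly. This is only an illustration, assuming units with \(c=1\) and spatial coordinates comoving with n; the symbols \(B_R(\vec {x}_0)\), R and \(\vec {x}_0\) are introduced here and do not appear elsewhere. If \(\Delta _1\) is the open ball \(B_R(\vec {x}_0)\subset \Sigma _{n,t_1}\) of radius R centered at \(\vec {x}_0\), then

$$\begin{aligned} \Delta _2 = \left\{ \vec {x} \in \Sigma _{n,t_2} \,\left| \, |\vec {x}-\vec {y}| \le |t_2-t_1| \text{ for some } \vec {y}\in B_R(\vec {x}_0)\right. \right\} = B_{R+|t_2-t_1|}(\vec {x}_0)\,, \end{aligned}$$

so that, in this case, (56) says that the probability of finding the particle in the ball of radius R at time \(t_1\) cannot exceed the probability of finding it in the concentric ball of radius \(R+|t_2-t_1|\) at time \(t_2\).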

Proof

As a first case, we assume that \(\Delta _1 \subset \Sigma _{n,t_1}\) is an open ball of finite radius, so that \(\Delta _2\) is an analogous open set in \(\Sigma _{n,t_2}\). Let us suppose \(t_2>t_1\) (the other case is analogous) and consider the region \(B \subset {{\mathbb {M}}}\) whose boundary is made of the two bases \(\Delta _1\), \(\Delta _2\), and the portion L of \(\partial J^+(\Delta _1)\) between them. B is a manifold with boundary and we can use the Stokes–Poincaré theorem for the 3-form (see Footnote 9)

$$\begin{aligned} \nu ^\psi _n = -\frac{1}{3!} \sqrt{-\det (g)} \epsilon _{\alpha \beta \gamma \delta } J^{\psi \delta }_n \textrm{d}x^\alpha \wedge \textrm{d}x^\beta \wedge \textrm{d}x^\gamma \end{aligned}$$

associated to the current \(J^\psi _n\) for the considered \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). We have chosen a Minkowskian coordinate system \(t=x^0,x^1,x^2,x^3\) comoving with n to write down the components of \(\nu ^\psi _n\) as above. With the choices above, the integral of the form on \(\Delta _{2}\) gives

$$\begin{aligned} \int _{\Delta _{2}} \nu ^\psi _n = \int _{\Delta _{2}} J^\psi _n \cdot n \textrm{d}\Sigma _{n,t_2}= \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle \,. \end{aligned}$$

Since \(J^\psi _n\) is conserved, the integral of \(\textrm{d}\nu ^\psi _n\) on B vanishes and hence so does the integral of \(\nu ^\psi _n\) on \(\partial B\), so that

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle - \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle = \int _L \nu ^\psi _n\,. \end{aligned}$$
(57)

To compute the integral we change coordinates and we pass to a system of lightlike and polar coordinates \(u,v, \theta , \phi \) where \(r, \theta ,\phi \) are standard polar spherical coordinates in \(\Sigma _{n,t_1}\) with center given by the center of \(\Delta _1\) and \(u:= t+r\), \(v:= t-r\) so that u is a lightlike future increasing coordinate along L. With these coordinates,

$$\begin{aligned} g = -\frac{1}{2}\textrm{d}u \otimes \textrm{d}v - \frac{1}{2} \textrm{d}v\otimes \textrm{d}u + \frac{1}{4}(u-v)^2 (\sin ^2 \theta \textrm{d}\phi \otimes \textrm{d}\phi + \textrm{d}\theta \otimes \textrm{d}\theta ) \end{aligned}$$

and, writing J for \(J_n^\psi \),

$$\begin{aligned} \nu ^\psi _n = -\frac{1}{2} (u-v)^2 \sin \theta J^v \textrm{d}u \wedge \textrm{d}\theta \wedge \textrm{d}\phi \,. \end{aligned}$$
(58)

Now, observe that, since \(J^\psi _n\) is past directed (if it does not vanish), we must have \(2J^t = J^u+ J^v \le 0\). The condition that \(J^\psi _n\) is zero or causal reads

$$\begin{aligned} -J^uJ^v + h(\vec {J}, \vec {J}) \le 0\,, \end{aligned}$$

where h is the Euclidean metric on \(\Sigma _{n,t}\) and \(\vec {J}\) is the part of the spatial component of \(J_n^\psi \) tangent to the spheres of constant r. In summary, \(J^uJ^v \ge 0\) and \(J^u+J^v \le 0\), so that \(J^v,J^u \le 0\). Since \(\theta \in [0,\pi ]\) in (58) and \(v=0\) on L, we conclude that

$$\begin{aligned} \int _L \nu ^\psi _n = -\int _L \frac{1}{2}u^2 \sin \theta J^v \textrm{d}u \wedge \textrm{d}\theta \wedge \textrm{d}\phi \ge 0\,. \end{aligned}$$
(59)
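The passage to the lightlike coordinates can be made explicit as follows. This is only a sketch of the standard computation in the coordinates \(u=t+r\), \(v=t-r\) introduced above. From \(t=(u+v)/2\) and \(r=(u-v)/2\),

$$\begin{aligned} -\textrm{d}t\otimes \textrm{d}t + \textrm{d}r\otimes \textrm{d}r= & {} -\frac{1}{4}(\textrm{d}u+\textrm{d}v)\otimes (\textrm{d}u+\textrm{d}v) + \frac{1}{4}(\textrm{d}u-\textrm{d}v)\otimes (\textrm{d}u-\textrm{d}v) \\= & {} -\frac{1}{2}\left( \textrm{d}u\otimes \textrm{d}v + \textrm{d}v\otimes \textrm{d}u\right) \,, \end{aligned}$$

while the components of a vector J transform as \(J^u = J^t + J^r\) and \(J^v = J^t - J^r\). Consequently,

$$\begin{aligned} g(J,J) = -J^uJ^v + r^2\left( (J^\theta )^2 + \sin ^2\theta \, (J^\phi )^2\right) \,, \end{aligned}$$

so that \(g(J,J)\le 0\) forces \(J^uJ^v \ge 0\). Together with \(J^u+J^v = 2J^t \le 0\), this gives \(J^u \le 0\) and \(J^v\le 0\), which is the sign information used in (59).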

Up to now we have established that

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle \,. \end{aligned}$$
(60)

To conclude the proof, it is sufficient to observe what follows in the case \(\Delta _1\) is a finite union of finite-radius open balls \(\Delta ^{(j)}_{1}\), \(j=1,\ldots , N\). We can always assume that no ball of the family is a subset of another ball of the family. Since N is finite, the region of \(\partial J^+(\Delta _1)\) between \(t_1\) and \(t_2\) is a piecewise smooth lightlike submanifold and we can apply the above reasoning by changing coordinates for every cone of the family. The integral over the surface \(\partial J^+(\Delta _1)\) between \(t_1\) and \(t_2\) is a finite sum of contributions of type (59) where each integral is now performed on a smaller portion of each conical surface. However, each contribution is nonnegative because the integrated function is nonnegative. \(\square \)

Remark 32

Even if it is not strictly necessary for our final goal, I prove that, if restricting to a suitable dense subspace of \({{{\mathcal {S}}}}(\mathcal{H})\), the inequality in (56) can be made strict. I consider a subspace \({{{\mathcal {D}}}}({{{\mathcal {H}}}}) \subset {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) of vectors \(\psi \in {{\mathcal {H}}}\) such that there is \(n\in {{\textsf{T}}}_+\) and a Minkowski coordinate system co-moving with n such that \({{\mathbb {R}}}^3 \ni \vec {p} \mapsto \psi (E_n(p), \vec {p}_n) \in {{\mathscr {D}}}({{\mathbb {R}}}^3)\) (the test-function space on \({{\mathbb {R}}}^3\)) when represented in the spatial coordinates on \({{\mathbb {R}}}^3\). The definition of \({{{\mathcal {D}}}}({{{\mathcal {H}}}})\) does not depend on the choice of n and co-moving Minkowskian coordinates as \({{{\mathcal {D}}}}({{{\mathcal {H}}}})\) is invariant under the representation U of \(IO(1,3)_+\) in (7). Finally, \(\mathcal{D}({{{\mathcal {H}}}}) \subset {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) is dense in \({{{\mathcal {H}}}}\). The proof of these elementary facts is analogous to the one of \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\) and it is left to the reader.

Relying on the well-posedness of the Characteristic Cauchy problem on Lorentzian cones, the following precise result is valid.

Proposition 33

With the hypotheses of Lemma 31, if \(\psi \in \mathcal{D}({{{\mathcal {H}}}})\) with \(||\psi ||=1\), then inequality (56) holds in the strict form

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle < \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle \end{aligned}$$
(61)

Proof

See “Appendix A”. \(\square \)

\(\blacksquare \)

I come back to the main stream of the reasoning with a second lemma.

Lemma 34

Consider the spatial localization observable \({{\textsf{A}}}\). Take \(n\in {{\textsf{T}}}_+\) and \(t_1,t_2 \in {{\mathbb {R}}}\) with \(t_2 \ne t_1\). Let \(\Delta _1 \subset \Sigma _{n,t_1}\) be a non-empty open set (respectively, a compact set), and let \(\Delta _2:= (J^+(\Delta _1) \cup J^-(\Delta _1))\cap \Sigma _{n,t_2}\) be the corresponding open (resp. compact) set in \(\Sigma _{n,t_2}\). Then

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle \end{aligned}$$
(62)

is valid for every \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\) with \(||\psi ||=1\).

Proof

We always assume \(t_2> t_1\), since the other case has a similar proof. First of all, we already know that if \(\Delta _1\) is open then \(\Delta _2\) is open as well. The case of \(\Delta _1\) compact is a subcase of a known fact valid in globally hyperbolic spacetimes (like \({{\mathbb {M}}}\)): if K is compact, the intersection of \(J^+(K)\) and a spacelike Cauchy surface (like \(\Sigma _{n,t_2}\)) is compact as well.

Let us first examine the case of \(\Delta _1\subset \Sigma _{n,t_1}\) open. According to Theorem 1.26 in [12], for every \(\delta >0\), there exists a countable collection \(\{\Gamma _j\}_{j=1,2,\ldots }\) of disjoint (non-empty) closed balls \(\Gamma _j \subset \Delta _1\) with diameter less than \(\delta \), such that

$$\begin{aligned} \int _{\Delta _1 \setminus \bigcup _{j\in {{\mathbb {N}}}} \Gamma _j} 1\, \textrm{d}\Sigma _{n,t_1} =0 \end{aligned}$$
(63)

where we remind the reader that \(\textrm{d}\Sigma _{n,t_1}\) is the Lebesgue measure when written in the spatial Minkowskian coordinates comoving with n. Evidently we can assume that the balls are open (and their closures are disjoint) since \(\partial \Gamma _j\) has zero Lebesgue measure. Let us define \(\Delta _1':= \bigcup _{j\in {{\mathbb {N}}}} \Gamma _j\) and \(\Delta '_2:= \Sigma _{n,t_2} \cap J^+(\Delta '_1)\). Since the probability measure defined by \({{\textsf{A}}}_{n,t_1}\) and \(\psi \) is by definition absolutely continuous with respect to the Lebesgue measure, (63) yields \(\langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta '_1)\psi \rangle = \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1)\psi \rangle \in [0,+\infty ]\). Furthermore, since \(\Delta '_1 \subset \Delta _1\), it must be \(\Delta _2' \subset \Delta _2\) and thus \(\langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta '_2)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2)\psi \rangle \). In summary, to prove the thesis, it is sufficient to establish that \(\langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta '_1)\psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta '_2)\psi \rangle \). Let us define \(\Delta _1^N:= \cup _{j=1}^N \Gamma _j\) and \(\Delta _2^N:= J^{+}(\Delta _1^N) \cap \Sigma _{n,t_2}\). By additivity and taking Lemma 31 into account,

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta '_1)\psi \rangle= & {} \lim _{N\rightarrow +\infty } \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1^N)\psi \rangle \le \lim _{N\rightarrow +\infty } \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2^N)\psi \rangle \\\le & {} \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta '_2)\psi \rangle \,. \end{aligned}$$

Notice that the limit of the right-most side exists because the sequence is non-decreasing as \(\Delta _2^N \subset \Delta _2^{N+1} \subset \Delta _2'\) by construction.

Let us pass to prove the thesis for \(\Delta _1\) compact. Since \(\Sigma _{n,t_1}\) is a metric space and \(\Delta _1\) compact, it is not difficult to construct a sequence of open sets \(A_1 \supset A_2 \supset \cdots \supset \Delta _1\) such that

$$\begin{aligned} \Delta _1 = \bigcap _{j=1,2,\ldots } A_j\,. \end{aligned}$$

Each \(A_j\) is the union of a finite (but arbitrarily large) number of balls centered on some points of \(\Delta _1\) with radius less than \(\delta _j \rightarrow 0^+\). As a consequence

$$\begin{aligned} \Delta _2 = \left( \bigcap _{j=1,2, \ldots } J^+(A_j)\right) \cap \Sigma _{n,t_2}\,. \end{aligned}$$

The inclusion \(\subset \) immediately arises from the definitions; the other inclusion is less trivial. Let us prove it. If e belongs to the right-hand side of the identity above and, as said, \(A_j\) is the finite union of balls of radius \(\delta _j>0\) centered on some points of \(\Delta _1\), we have that (see Footnote 10) \(\text{ dist }(e, J^+(\Delta _1) \cap \Sigma _{n,t_2}) < \delta _j\) for every j, where \(\delta _j \rightarrow 0^+\). As a consequence e belongs to the closure of \(\Delta _2 = J^+(\Delta _1) \cap \Sigma _{n,t_2}\), which is compact, thus closed (the space being Hausdorff). Hence \(e \in \Delta _2\). Finally, taking advantage of the already proved result on open sets and of the continuity from above of finite measures,

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle= & {} \inf _j \langle \psi | {{\textsf{A}}}_{n,t_2}(J^+(A_j) \cap \Sigma _{n,t_2} ) \psi \rangle \ge \inf _j \langle \psi | {{\textsf{A}}}_{n,t_1}(A_j) \psi \rangle \\ {}= & {} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle \,. \end{aligned}$$

\(\square \)

I am now in a position to prove the main result of this section, that every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution (according to (a) in Definition 15) for every spatial localization probability measure constructed out of the Terno POVM \({{\textsf{A}}}\) and every pure state \(\psi \in {{{\mathcal {H}}}}\).

Theorem 35

Consider the spatial localization observable \({{\textsf{A}}}\). Take \(n\in {{\textsf{T}}}_+\) and \(t_1,t_2 \in {{\mathbb {R}}}\). Let \(\Delta _1 \subset \Sigma _{n,t_1}\) be a Lebesgue set and let \(\Delta _2:= (J^+(\Delta _1) \cup J^-(\Delta _1))\cap \Sigma _{n,t_2}\) be the corresponding set in \(\Sigma _{n,t_2}\). Then

$$\begin{aligned} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle \,, \quad \forall \psi \in {{{\mathcal {H}}}} \text{ with } ||\psi ||=1. \end{aligned}$$
(64)

In other words, every \(n\in {{\textsf{T}}}_+\) defines a causal time evolution according to (a) in Definition 15 for the family of spatial localization probability measures \(\mu ^\psi (\cdot ):= \langle \psi | {{\textsf{A}}}(\cdot ) \psi \rangle \).

Proof

First of all, notice that \(\mu ^{\psi }_{n,t}(\cdot ):= \langle \psi | {{\textsf{A}}}_{n,t}(\cdot ) \psi \rangle \), for \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), is necessarily regular when restricted to \({\mathscr {B}}(\Sigma _{n,t})\), since \(\Sigma _{n,t}\) is a countable union of compacts with finite measure (Theorem 2.18 in [33]). As a consequence the completion \(\overline{\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}}\) of \(\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}\) is regular as well (Prop. 1.59 in [10]). The \(\sigma \)-algebra of the regular complete measure \(\overline{\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}}\) includes the Lebesgue \(\sigma \)-algebra in particular, and the completion \(\overline{\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}}\) restricted to \({\mathscr {L}}(\Sigma _{n,t})\) coincides with \(\mu ^{\psi }_{n,t}\) itself. This can be seen as follows. The \(\sigma \)-algebra of a completion \({\overline{\mu }}\), where \(\mu : {\mathscr {S}}(X) \rightarrow [0,+\infty ]\) is a positive \(\sigma \)-additive measure, can be constructed as the family of sets \(E\cup Z\) where \(E\in {\mathscr {S}}(X)\) and \(Z\subset F \in {\mathscr {S}}(X)\) with \(\mu (F)=0\). Obviously \({\overline{\mu }}(E\cup Z):= \mu (E)\). From these properties we can write \(\overline{\mu ^{\psi }_{n,t}|_{{\mathscr {B}}(\Sigma _{n,t})}}(G)=\mu ^{\psi }_{n,t}(G)\) if \(G\in {\mathscr {L}}(\Sigma _{n,t})\), since \(G= E\cup Z\) where \(E\in {\mathscr {B}}(\Sigma _{n,t})\) and \(Z \subset F \in {\mathscr {B}}(\Sigma _{n,t})\) such that F has zero Lebesgue measure and thus \(\mu ^{\psi }_{n,t}(F)=0\) because \(\mu ^{\psi }_{n,t}\) is absolutely continuous with respect to the Lebesgue measure. We conclude that \(\mu ^{\psi }_{n,t}\) is regular on the Lebesgue \(\sigma \)-algebra because it is the restriction of a regular measure. In particular, it is inner regular.
So, if \(\Delta _1\) is Lebesgue-measurable, for \(\psi \in {{{\mathcal {S}}}}(\mathcal{H})\) we can take advantage of Lemma 34 proving that

$$\begin{aligned}{} & {} \langle \psi | {{\textsf{A}}}_{n,t_1}(\Delta _1) \psi \rangle = \sup \{\langle \psi | {{\textsf{A}}}_{n,t_1}(K) \psi \rangle \,|\, K\subset \Delta _1\,, K \text{ compact } \}\\{} & {} \quad \le \sup \{\langle \psi | {{\textsf{A}}}_{n,t_2}(J^+(K) \cap \Sigma _{n,t_2}) \psi \rangle \,|\, K\subset \Delta _1\,, K \text{ compact } \} \le \langle \psi | {{\textsf{A}}}_{n,t_2}(\Delta _2) \psi \rangle \,, \end{aligned}$$

where we have also used the fact that \(J^+(K) \cap \Sigma _{n,t_2} \subset J^+(\Delta _1) \cap \Sigma _{n, t_2} = \Delta _2\).

The thesis is therefore true if \(\psi \in {{{\mathcal {S}}}}(\mathcal{H})\) with \(||\psi ||=1\). Evidently the last requirement can be dropped by bi-linearity of the scalar product. Since \({{{\mathcal {S}}}}(\mathcal{H})\) is dense in \({{{\mathcal {H}}}}\) and the scalar product is continuous, the result extends to the whole Hilbert space and the proof is over. \(\square \)
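The density argument closing the proof can be spelled out as follows. This is only a sketch, where the generic symbols A and \(\phi \) are introduced for illustration: A denotes any effect of the POVM, so that \(0\le A \le I\) and \(||A||\le 1\). For \(\phi ,\psi \in {{{\mathcal {H}}}}\),

$$\begin{aligned} \left| \langle \phi | A \phi \rangle - \langle \psi | A \psi \rangle \right| = \left| \langle \phi -\psi | A \phi \rangle + \langle \psi | A (\phi -\psi ) \rangle \right| \le (||\phi || + ||\psi ||)\, ||\phi -\psi ||\,. \end{aligned}$$

Hence, if \({{{\mathcal {S}}}}(\mathcal{H}) \ni \psi _j \rightarrow \psi \in {{{\mathcal {H}}}}\), both sides of (64) evaluated on \(\psi _j\) converge to the corresponding quantities for \(\psi \), and the non-strict inequality survives the limit.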

Corollary 36

There is no state \(\psi \in {{{\mathcal {H}}}}\) that satisfies the hypotheses of the Hegerfeldt theorem (Theorem 19) for any family of bounded balls in the rest space of any arbitrarily fixed \(n\in {{\textsf{T}}}_+ \).

Proof

The thesis of Hegerfeldt’s theorem is incompatible with the result of the previous theorem. \(\square \)

7 Subtleties with the notion of position and Castrigiano’s causality requirement

There is a crucial feature of the notion of spatial position by Terno: it uses a probability four-current that, in spite of being a four-vector, depends on the reference frame n, as is evident in (53) when \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). That is an unavoidable fact since the notion of energy-momentum current has the same type of dependence: \(J^\nu _n = n^\mu {T_{\mu }}^\nu \). This feature leads to a more articulated picture where one can define the probability to find a particle in \(\Delta \subset \Sigma _{n',t'}\) still referring to the current associated to \(n\ne n'\)! That is permitted because

$$\begin{aligned} J^{\psi \mu }_{n}(x) n'_\mu \ge 0 \end{aligned}$$

in view of Proposition 30, when \(n'\in {{\textsf{T}}}_+\). In fact, \(J^{\psi \mu }_{n}(x)\) is causal and past directed or vanishes, which produces the inequality above precisely because \(n'\) is timelike and future directed. Hence, if \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\), one can define a spatial localization probability

$$\begin{aligned} \mu ^{\psi ,n}_{n',t'}(\Delta ):= \int _{\Delta } J^{\psi \mu }_{n}(x) n'_\mu \textrm{d}\Sigma _{n',t'}\,,\quad \Delta \in {\mathscr {L}}(\Sigma _{n',t'})\,, \quad -n'\cdot x = t'\,. \end{aligned}$$
(65)
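
A short way to see the inequality \(J^{\psi \mu }_{n}(x) n'_\mu \ge 0\) entering (65) is the following sketch. It assumes the signature \((-,+,+,+)\) suggested by the convention \(-n'\cdot x = t'\), and it writes components in an arbitrary Minkowski frame; the symbols \(J^0\), \(\vec{J}\), \(n'^0\), \(\vec{n}'\) are introduced here only for illustration. If \(J\) is causal and past directed, then \(J^0 \le 0\) and \(|\vec{J}| \le |J^0|\); if \(n'\) is timelike and future directed, then \(n'^0 > 0\) and \(|\vec{n}'| < n'^0\). The Cauchy–Schwarz inequality then gives

$$\begin{aligned} J^\mu n'_\mu = -J^0 n'^0 + \vec{J}\cdot \vec{n}' \ge |J^0|\, n'^0 - |\vec{J}|\,|\vec{n}'| \ge |J^0| \left( n'^0 - |\vec{n}'|\right) \ge 0\,, \end{aligned}$$

with equality only if \(J=0\).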

The divergence theorem, exploiting the fact that \( J^{\psi \mu }_{n}(x)\) rapidly vanishes at spatial infinity and that \(\partial _\mu J^{\psi \mu }_{n}(x)= \partial _\mu n^\nu T^{\psi \mu }_{\nu }(x)_n=0\), ensures the correct normalization

$$\begin{aligned} \mu ^{\psi ,n}_{n',t'}(\Sigma _{n',t'})= & {} \int _{\Sigma _{n',t'}} J^{\psi \mu }_{n}(x) n'_\mu \textrm{d}\Sigma _{n',t'} = \int _{\Sigma _{n,t}} J^{\psi \mu }_{n}(x) n_\mu \textrm{d}\Sigma _{n,t}\\= & {} \langle \psi |{{\textsf{A}}}_{n,t}(\Sigma _{n,t})\psi \rangle = 1\,. \end{aligned}$$

Physically speaking, \(\mu ^{\psi ,n}_{n',t'}(\Delta )\) accounts for the probability of finding a particle in \(\Delta \subset \Sigma _{n',t'}\) using detectors which are at rest in n but synchronized with \(n'\). There is no reason why this probability should coincide with \(\mu ^\psi _{n',t'}(\Delta )=\langle \psi |{{\textsf{A}}}_{n',t'}(\Delta )\psi \rangle \), as the corresponding energy densities do not coincide either. This result opens a new perspective on the notion of spatial localization which deserves to be investigated.

Mathematically speaking, all that can be encapsulated into a new family of POVMs depending on both n and \(n'\) (and \(t'\)).

Theorem 37

If \(n,n' \in {{\textsf{T}}}_+\) and \(t'\in {{\mathbb {R}}}\), there is only one POVM with effects \({{\textsf{M}}}^n_{n',t'}(\Delta ) \in {{\mathfrak {B}}}({{{\mathcal {H}}}}) \) for \(\Delta \in {\mathscr {L}}(\Sigma _{n',t'})\) such that

$$\begin{aligned} \langle \psi | {{\textsf{M}}}^n_{n',t'}(\Delta ) \psi \rangle = \int _{\Delta } J^{\psi \mu }_{n}(x) n'_\mu \textrm{d}\Sigma _{n',t'}(x) \,, \quad \forall \psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\,. \end{aligned}$$
(66)

Furthermore the following holds.

  1. (1)

    It has the form, in terms of the Newton–Wigner POVM \({{\textsf{Q}}}_{n',t'}\) on \(\Sigma _{n',t'}\),

    $$\begin{aligned}{} & {} {{\textsf{M}}}^n_{n',t'}(\Delta ) =\frac{1}{2}\left( \sqrt{\frac{H_{n'}}{H_{n}}}{{\textsf{Q}}}_{n',t'}(\Delta ) \sqrt{\frac{H_{n}}{H_{n'}}} + \sqrt{\frac{H_{n}}{H_{n'}}} {{\textsf{Q}}}_{n',t'}(\Delta ) \sqrt{\frac{H_{n'}}{H_{n}}}\right) \nonumber \\{} & {} -\frac{n\cdot n'}{2} \sqrt{\frac{H_{n'}}{H_{n}}}\left( \eta ^{\mu \nu }\frac{P_{n\mu }}{H_{n'}} {{\textsf{Q}}}_{n',t'}(\Delta ) \frac{P_{n\nu }}{H_{n'}} + \frac{m}{H_{n'}} {{\textsf{Q}}}_{n',t'}(\Delta ) \frac{m}{H_{n'}} \right) \sqrt{\frac{H_{n'}}{H_{n}}}\,. \end{aligned}$$
    (67)

    (where the various everywhere-defined bounded composite operators \(H_n/H_{n'}\), etc., are defined in terms of the joint spectral measure of \(P^\mu \) and standard spectral calculus).

  2. (2)

    It reduces to the Terno POVM for \(n=n'\):

    $$\begin{aligned} {{\textsf{M}}}^n_{n,t}(\Delta ) = {{\textsf{A}}}_{n,t}(\Delta )\,, \text{ if }~n\in {{\textsf{T}}}_+, t\in {{\mathbb {R}}}~\text{ and }~ \Delta \in {\mathscr {L}}(\Sigma _{n,t}). \end{aligned}$$
    (68)
  3. (3)

    The \(IO(1,3)_+\) covariance relations are valid,

    $$\begin{aligned} U_{h} {{\textsf{M}}}^n_{n',t'}(\Delta ) U_{h}^{-1} = {{\textsf{M}}}^{\Lambda _h n}_{\Lambda _h n', t'_h}(h\Delta ) \,, \quad \forall \Delta \in {\mathscr {L}}(\Sigma _{n',t'})\,, \quad \forall h \in IO(1,3)_+\,. \nonumber \\ \end{aligned}$$
    (69)

Proof

(Initial statement and (1)). Let us call F the operator defined by the right-hand side of (67). It is evidently everywhere defined and bounded on \({{{\mathcal {H}}}}\). By polarization and density of \({{{\mathcal {S}}}}({{{\mathcal {H}}}})\), it is completely determined by the values \(\langle \psi |F \psi \rangle \) when \(\psi \in {{{\mathcal {S}}}}({{{\mathcal {H}}}})\). Let us prove that it satisfies (66). By direct inspection, if \(\psi \in {{{\mathcal {S}}}}({{\mathcal {H}}})\), taking (51) and (52) into account, the right-hand side of (66) can be written, with \(-n'\cdot x= t'\), as

$$\begin{aligned}{} & {} \int _{{\textsf{V}}_{m,+}}\textrm{d}\mu (p) \int _{\Delta }\textrm{d}\Sigma _{n',t'}(x) \int _{{\textsf{V}}_{m,+}} \textrm{d}\mu (q) \frac{e^{-i(q-p)\cdot x}}{(2\pi )^3}\\{} & {} \quad \frac{E_n(p)E_{n'}(q)+ E_n(q) E_{n'}(p) - n\cdot n'(p^\alpha q_\alpha + m^2)}{2 \sqrt{ E_n(q) E_n(p)}} \overline{\psi (p)}\psi (q) \end{aligned}$$

which, in turn, coincides with \(\langle \psi |F \psi \rangle \) when taking (17) into account, as wanted. Notice that (66) implies that the everywhere defined extended operator \({{\textsf{M}}}^n_{n',t'}(\Delta )\) is positive as it is the continuous extension of a positive operator. The family of these operators, with \(n, n',t'\) fixed, is also weakly \(\sigma \)-additive in \(\Delta \) because \({{\textsf{Q}}}_{n',t'}\) in the right-hand side of (67) is weakly \(\sigma \)-additive, and the operators appearing as factors are bounded and everywhere defined. As the family \({{\textsf{M}}}^n_{n',t'}(\Delta )\), with \(\Delta \) variable in \({\mathscr {L}}(\Sigma _{n',t'})\), is made of positive operators with \({{\textsf{M}}}^n_{n',t'}(\Sigma _{n',t'})=I\) (direct inspection), we conclude that the said family (with n fixed) is a (normalized) POVM on \({\mathscr {L}}(\Sigma _{n',t'})\).

(2) It is obvious from (67) and (38).
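
Spelling this out as a sketch: for \(n=n'\) the operators \(\sqrt{H_{n'}/H_{n}}\) and \(\sqrt{H_{n}/H_{n'}}\) reduce to the identity, and the prefactor of the second line of (67) becomes \(-n\cdot n/2 = 1/2\), since \(n\cdot n=-1\) for \(n\in {{\textsf{T}}}_+\) (assuming the signature \((-,+,+,+)\) suggested by the convention \(-n'\cdot x=t'\)). Hence

$$\begin{aligned} {{\textsf{M}}}^n_{n,t}(\Delta ) = {{\textsf{Q}}}_{n,t}(\Delta ) + \frac{1}{2}\left( \eta ^{\mu \nu }\frac{P_{n\mu }}{H_{n}} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{P_{n\nu }}{H_{n}} + \frac{m}{H_{n}} {{\textsf{Q}}}_{n,t}(\Delta ) \frac{m}{H_{n}} \right) \,, \end{aligned}$$

which is the expression to be compared with (38).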

(3) The proof follows immediately from the analogous covariance properties of \({{\textsf{Q}}}_{n,t}\) and the basic covariance properties of \(H_n\) and of the composite (bounded everywhere defined) operators \(H_n/H_{n'}\), \(m/H_{n}\), \(P_{n'}^\mu /H_n\). \(\square \)
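
The algebraic factor of the momentum-space integrand appearing in the proof above can be checked numerically. The following Python sketch is not part of the original argument and only illustrates it under stated assumptions: the signature is taken to be \((-,+,+,+)\), the energy in the frame n is taken to be \(E_n(p) = -n\cdot p\), and all function names are hypothetical. It checks that the factor is symmetric under \(p \leftrightarrow q\) and that at \(p=q\) it reduces to \(E_{n'}(p)\), because \(p^\alpha p_\alpha + m^2 = 0\) on the mass shell. That diagonal value is what one would expect to compensate the measure \(\textrm{d}\mu \) in the normalization, if \(\textrm{d}\mu (p)\) is the invariant measure \(\textrm{d}^3p/E_{n'}(p)\) in the frame \(n'\) (an assumption).

```python
import numpy as np

# Assumption: metric signature (-,+,+,+).
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])

def mdot(a, b):
    """Minkowski product a.b with signature (-,+,+,+)."""
    return float(a @ ETA @ b)

def unit_timelike(v):
    """Future-directed unit timelike vector with spatial velocity v, |v| < 1."""
    v = np.asarray(v, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - v @ v)
    return gamma * np.concatenate(([1.0], v))

def on_shell(p3, m):
    """Four-momentum with p.p = -m^2 and positive time component."""
    p3 = np.asarray(p3, dtype=float)
    return np.concatenate(([np.sqrt(m * m + p3 @ p3)], p3))

def energy(n, p):
    """Energy E_n(p) = -n.p in the frame n (assumed definition)."""
    return -mdot(n, p)

def kernel(n, n_prime, p, q, m):
    """Algebraic factor of the integrand, without plane wave and wavefunctions."""
    num = (energy(n, p) * energy(n_prime, q)
           + energy(n, q) * energy(n_prime, p)
           - mdot(n, n_prime) * (mdot(p, q) + m * m))
    return num / (2.0 * np.sqrt(energy(n, q) * energy(n, p)))

# Illustrative values only.
m = 1.3
n = unit_timelike([0.3, -0.2, 0.1])
n_prime = unit_timelike([-0.5, 0.4, 0.2])
p = on_shell([0.7, -1.1, 0.4], m)
q = on_shell([-0.2, 0.9, 1.5], m)

# Diagonal value minus E_{n'}(p): should vanish up to rounding.
print(kernel(n, n_prime, p, p, m) - energy(n_prime, p))
# Antisymmetric part under p <-> q: should vanish up to rounding.
print(kernel(n, n_prime, p, q, m) - kernel(n, n_prime, q, p, m))
```

For \(n=n'\) at rest the same function returns \((p^0q^0 + \vec{p}\cdot \vec{q} + m^2)/(2\sqrt{p^0q^0})\), so the sketch can also be used to compare the mixed case with the case of a single frame.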

Remark 38

For a given \(n_0 \in {{\textsf{T}}}_+\), the physical meaning of the family of POVMs

$$\begin{aligned} {{\textsf{M}}}^{n_0}:= \{{{\textsf{M}}}^{n_0}_{n, t}\}_{n\in {{\textsf{T}}}_+, t\in {{\mathbb {R}}}} \end{aligned}$$

is the notion of spatial position observable, referred to all reference frames \(n\in {{\textsf{T}}}_+\) and every global time \(t\in {{\mathbb {R}}}\) of each such reference frame, when the detectors used are always co-moving with \(n_0\).

To conclude this work, I prove that for every given \(n_0 \in {{\textsf{T}}}_+\), the family of POVMs \({{\textsf{M}}}^{n_0}\) satisfies Castrigiano’s causality condition.

Theorem 39

For given \(n_0\in {{\textsf{T}}}_+\) and \(\psi \in {{{\mathcal {H}}}}\), define the family of probability measures \(\mu ^{\psi , n_0}_{n,t}\)

$$\begin{aligned} \mu ^{\psi , n_0}_{n,t}(\Delta ):= \langle \psi |{{\textsf{M}}}^{n_0}_{n,t}(\Delta ) \psi \rangle \,, \quad n\in {{\textsf{T}}}_+, t\in {{\mathbb {R}}}, \Delta \in {\mathscr {L}}(\Sigma _{n,t})\,. \end{aligned}$$

That family satisfies Castrigiano’s causality condition (b) in Definition 15:

$$\begin{aligned} \mu ^{\psi , n_0}_{n,t}(\Delta ) \le \mu ^{\psi , n_0}_{n',t'}(\Delta ') \quad \forall n, n'\in {{\textsf{T}}}_+ \,, \forall t, t'\in {{\mathbb {R}}}\,, \forall \Delta \in {\mathscr {L}}(\Sigma _{n,t}) \end{aligned}$$

where \(\Delta ':= \left( J^+(\Delta ) \cup J^-(\Delta ) \right) \cap \Sigma _{n',t'}\).

In particular, the time evolution associated with every n is causal according to (a) in Definition 15.

Sketch of proof. Condition (a) in Definition 15 is satisfied if condition (b) holds, so it suffices to prove the latter. The proof of Theorem 35 and its preparatory lemmata carries over to the present case since, for \(\psi \in {{{\mathcal {S}}}}(\mathcal{H})\), the only two relevant facts are that (i) the values of \(\mu ^{\psi , n_0}_{n,t}(\Delta )\) and \(\mu ^{\psi , n_0}_{n',t'}(\Delta ')\) (where for the moment \(n=n'=n_0\)) are spatial boundary integrals of the conserved four current \(J^\psi _{n_0}\) and (ii) \(J^\psi _{n_0}\) is either zero or causal and past directed. These facts remain valid when the requirement \(n=n'=n_0\) is dropped: it does not matter whether the normal vectors n and \(n'\) to the two hyperplanes containing, respectively, \(\Delta \) and \(\Delta '\) are parallel to the vector \(n_0\) defining \(J^\psi _{n_0}\) or not. Indeed, in proving Theorem 35 the bases of the four-dimensional solid used to integrate the current were orthogonal to \(n_0\) only as a contingent fact, due to the very definition of the measures \(\mu ^\psi _{n,t}\), which is now relaxed. The only case where the above proof has to be slightly changed is when the possible intersection of \(\Sigma _{n,t}\) and \(\Sigma _{n',t'}\) passes through \(\Delta \). In that case it is convenient to treat the two parts of \(\Delta \) separately. \(\Box \)
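
As a simple illustration of the set \(\Delta '\) appearing in Theorem 39, consider the following sketch, where the closed ball \(\overline{B}_r(\vec{x}_0)\) and its radius r are symbols introduced here only for this purpose, units with \(c=1\) are assumed, and \(\Sigma _{n,t}\) is identified with \({{\mathbb {R}}}^3\) through Minkowski coordinates adapted to n. Take \(n'=n\) and \(\Delta = \overline{B}_r(\vec{x}_0) \subset \Sigma _{n,t}\). A point of \(\Sigma _{n,t'}\) is causally connected to \(\Delta \) exactly when its spatial distance from some point of \(\Delta \) does not exceed \(|t'-t|\), so that

$$\begin{aligned} \Delta ' = \left( J^+(\Delta ) \cup J^-(\Delta ) \right) \cap \Sigma _{n,t'} = \overline{B}_{r+|t'-t|}(\vec{x}_0)\,, \end{aligned}$$

and the causality condition states that the probability attributed to the ball at time t cannot exceed the probability attributed to the enlarged ball at time \(t'\).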

8 Discussion

In this work, I rigorously proved that, when attention is restricted to issue (I1) of the Introduction, a spatial notion of localization for a massive Klein–Gordon particle is possible without problems with causality (with some caveats, however; see below), avoiding in particular the pathologies predicted by Hegerfeldt’s theorem. As has long been known, this obstruction prevents, in particular, the existence of spatially localized states. The crucial mathematical notion here is the covariant family of POVMs \({{\textsf{A}}}\) proposed by Terno [36], which has been analyzed in considerable mathematical detail, focusing on its interplay with the popular Newton–Wigner notion of spatial localization. This analysis showed that the notion of localization based on the POVM \({{\textsf{A}}}\), and the associated first moment in particular, retains many good properties of the Newton–Wigner localization notion while avoiding many of its problematic features. To what extent this notion is compatible with the interplay of causality and post-measurement state (I2) was not the object of this work and will be investigated elsewhere. Terno’s notion seems to be in good agreement with Castrigiano’s notion of causal evolution ((a) in Definition 15). The validity of the very Castrigiano causality condition ((b) in Definition 15) needs more care and a different, perhaps physically more subtle, analysis than the case of causal systems rigorously treated by Castrigiano [8]. Terno’s notion of spatial localization relies upon the notion of energy density and not upon the notion of density of charge. The former is associated with a conserved tensor field, the stress energy tensor \(T_{\mu \nu }\), instead of a vector field. As a matter of fact, the relevant probability density in the reference frame n is the normalized energy density \(T_{\mu \nu } n^\mu n^\nu \).
This choice has the apparent drawback that probability densities of different reference frames turn out to be incomparable, just because the densities \(T_{\mu \nu } n^\mu n^\nu \) and \(T_{\mu \nu } n'^\mu n'^\nu \) are not connected by the standard argument based on the conservation law \(\partial _\mu T^{\mu \nu }=0\) and the Stokes–Poincaré theorem. That law permits one to compare different boundary terms where only one normal vector is changed at a time, rather than a pair: \(n,n \rightarrow n',n'\). Testing Castrigiano’s causality condition seems to be impossible along that route. However, the physical interpretation turns out to be of some help at this juncture. The double occurrence of n can be relaxed to a pair of different timelike future-oriented unit vectors, \(n,n'\). The fact that the density \(T^{\mu \nu } n_\mu n'_\nu \) is still non-negative suggests a new and different operational interpretation of the notion of spatial position. To assert that the particle stays in \(\Delta \subset \Sigma _{n',t'}\) one should not only specify the reference frame \(n'\) and the instant of time \(t'\), but one should also make explicit the choice of the rest frame n of the employed detectors (which actually are energy detectors). The relevant density therefore is \(J_{n}^\mu n'_\mu \ge 0\), where \(J_{n}^\mu := n^\nu T_{\nu }^\mu \). This picture includes the apparently most natural choice \(n= n'\), but one is also allowed to pick out \(n\ne n'\). Keeping n fixed produces a new family of POVMs \({{\textsf{M}}}^n_{n',t'}\) as \(n'\) and \(t'\) vary. This family satisfies both requirements (a) and (b) in Definition 15, in particular, Castrigiano’s causality condition (b). It is not clear to the author if this approach is really physically meaningful and the subject certainly deserves further investigation and discussion.

Actually, something can be said about the causal relation of \({{\textsf{A}}}_{n,t}(\Delta _1)\) and \({{\textsf{A}}}_{n',t'}(\Delta _2)\), where \(\Delta _2 = (J^+(\Delta _1) \cup J^-(\Delta _1)) \cap \Sigma _{n',t'}\) and \(n\ne n'\), on the grounds of a purely mathematical observation. However, it is not clear if this reasoning may lead to a proof of Castrigiano’s causality condition, especially because there is no evident physical reason behind the following argument. If one assumes that \(\psi \in {{{\mathcal {D}}}}({{{\mathcal {H}}}})\), and that \(\Delta _1\subset \Sigma _{n,t_1}\) has the special form as in Proposition 33, then the strict inequality (61) is valid. Therefore, for continuity reasons, keeping fixed \(\psi \), n and \(t=t_1\) on the left-hand side of (61), that inequality must still be valid if one slightly moves \(n'\) away from n and \(t'\) away from \(t_2\), changing \(\Delta _2\) accordingly. If the neighborhood of values \((n',t')\) around \((n,t)\) where this inequality holds were the entire \({{\textsf{T}}}_+\times {{\mathbb {R}}}\), one could use an improvement of the argument already exploited in the main text to pass from the special type of set \(\Delta _1\) to a generic element of \({\mathscr {L}}(\Sigma _{n,t})\), possibly relaxing \(<\) to \(\le \). The usual density argument of \({{{\mathcal {D}}}}({{{\mathcal {H}}}})\) in \({{{\mathcal {H}}}}\) would conclude the proof. However, I do not think that the said neighborhood of \((n,t)\) covers the full set of possibilities of the choice of \((n',t')\). All that will be investigated elsewhere.