1 Introduction

1.1 On the classical mathematical formulation of quantum mechanics

Throughout this paper H will denote a complex, not necessarily separable, Hilbert space with dimension at least 2. In the classical mathematical formulation of quantum mechanics such a space is used to describe experiments at the atomic scale. For instance, the famous Stern–Gerlach experiment (which was one of the firsts showing the reality of the quantum spin) can be described using the two-dimensional Hilbert space \({\mathbb {C}}^2\). In the classical formulation of quantum mechanics, the space of all rank-one projections \({{\mathcal {P}}}_1(H)\) plays an important role, as its elements represent so-called quantum pure-states (in particular in the Stern–Gerlach experiment they represent the quantum spin). The so-called transition probability between two pure states \(P, Q \in {{\mathcal {P}}}_1(H)\) is the number \(\mathrm{tr}PQ\), where \(\mathrm{tr}\) denotes the trace. For the physical meaning of this quantity we refer the interested reader to e.g. [33]. A very important cornerstone of the mathematical foundations of quantum mechanics is Wigner’s theorem, which states the following.

Wigner’s Theorem

Given a bijective map \(\phi :{{\mathcal {P}}}_1(H)\rightarrow {{\mathcal {P}}}_1(H)\) that preserves the transition probability, i.e. \(\mathrm{tr}\phi (P)\phi (Q) = \mathrm{tr}PQ\) for all \(P,Q\in {{\mathcal {P}}}_1(H)\), one can always find either a unitary, or an antiunitary operator \(U:H\rightarrow H\) that implements \(\phi \), i.e. we have \(\phi (P) = UPU^*\) for all \(P\in {{\mathcal {P}}}_1(H)\).

For an elementary proof see [11]. As explained thoroughly by Simon in [29], this theorem plays a crucial role (together with Stone’s theorem and some representation theory) in obtaining the general time-dependent Schrödinger equation that describes quantum systems evolving in time (and which is usually written in the form \(i \hslash \tfrac{d}{dt} |\Psi (t)\rangle = {\hat{H}} |\Psi (t)\rangle \), where \(\hslash \) is the reduced Planck constant, \({\hat{H}}\) is the Hamiltonian operator, and \(|\Psi (t)\rangle \) is the unit vector that describes the system at time t).

One of the main objectives of quantum mechanics is the study of measurement. In the classical formulation an observable (such as the position/momentum of a particle, or a component of a particle’s spin) is represented by a self-adjoint operator. Equivalently, we could say that an observable is represented by a projection-valued measure \(E:{{\mathcal {B}}}_{\mathbb {R}}\rightarrow {{\mathcal {P}}}(H)\) (i.e. the spectral measure of the representing self-adjoint operator), where \({{\mathcal {B}}}_{\mathbb {R}}\) denotes the set of all Borel sets in \({\mathbb {R}}\) and \({{\mathcal {P}}}(H)\) the space of all projections (also called sharp effects) acting on H. If \(\Delta \) is a Borel set, then the quantum event that we get a value in \(\Delta \) corresponds to the projection \(E(\Delta )\). However, this mathematical formulation of observables implicitly assumes that measurements are perfectly accurate, which is far from being the case in real life. This was the crucial thought which led Ludwig to give an alternative axiomatic formulation of quantum mechanics which was introduced in his famous books [18] and [19].

1.2 On Ludwig’s mathematical formulation of quantum mechanics

This paper is related to Ludwig’s formulation of quantum mechanics, more precisely, we shall examine one of the theory’s most important relations, called coexistence (see the definition later). The main difference compared to the classical formulation is that (due to the fact that no perfectly accurate measurement is possible in practice) quantum events are not sharp, but fuzzy. Therefore, according to Ludwig, a quantum event is not necessarily a projection, but rather a self-adjoint operator whose spectrum lies in [0, 1]. Such an operator is called an effect, and the set of all such operators is called the Hilbert space effect algebra, or simply the effect algebra, which will be denoted by \({{\mathcal {E}}}(H)\). Clearly, we have \({{\mathcal {P}}}(H)\subset {{\mathcal {E}}}(H)\). A fuzzy or unsharp quantum observable corresponds to an effect-valued measure on \({{\mathcal {B}}}_{\mathbb {R}}\), which is often called a normalised positive operator-valued measure, see e.g. [13] for more details on this. We point out that the role of effects and positive operator-valued measures was already emphasised in the earlier book [8] of Davies. For some of the subsequent contributions to the theory we refer the reader to the work of Kraus [17] and the recent book of Busch–Lahti–Pellonpää–Ylinen [3].

Let us point out that, contradicting to its name, \({{\mathcal {E}}}(H)\) is obviously not an actual algebra. There are a number of operations and relations on the effect algebra that are relevant in mathematical physics. First of all, the usual partial order \(\le \), defined by \(A\le B\) if and only if \(\langle Ax, x \rangle \le \langle B x , x \rangle \) for all \(x \in H\), expresses that the occurrence of the quantum event A implies the occurrence of B. We emphasise that \(({{\mathcal {E}}}(H),\le )\) is not a lattice, because usually there is no largest effect C whose occurrence implies both A and B (see [1, 25, 31] for more details on this). Note that, as can be easily shown, we have \({{\mathcal {E}}}(H) = \{A \in {{\mathcal {B}}}(H) :A=A^*, 0\le A\le I\}\), where \({{\mathcal {B}}}(H)\) denotes the set of all bounded operators on H, \(A^*\) the adjoint of A, and I the identity operator. Hence sometimes the literature refers to \({{\mathcal {E}}}(H)\) as the operator interval [0, I].

Second, the so called ortho-complementation \(\perp \) is defined by \(A^\perp = I - A\), and it can be thought of as the complement event (or negation) of A, i.e. A occurs if and only if \(A^\perp \) does not.

We are mostly interested in the relation of coexistence. Ludwig called two effects coexistent if they can be measured together by applying a suitable apparatus. In the language of mathematics (see [18, Theorem IV.1.2.4]), this translates into the following definition:

Definition 1.1

\(A, B \in {{\mathcal {E}}}(H)\) are said to be coexistent, in notation \(A \sim B\), if there are effects \(E,F,G \in {{\mathcal {E}}}(H)\) such that

$$\begin{aligned} A = E + G, \quad B = F + G \quad \text {and}\quad E+F+G \in {{\mathcal {E}}}(H). \end{aligned}$$

We point out that in the earlier work [8] Davies examined the simultaneous measurement of unsharp position and momentum, which is closely related to coexistence. It is apparent from the definition that coexistence is a symmetric relation. Although it is not trivial from the above definition, two sharp effects \(P,Q \in {{\mathcal {P}}}(H)\) are coexistent if and only if they commute (see Sect. 2), which corresponds to the classical formulation. We will denote the set of all effects that are coexistent with \(A\in {{\mathcal {E}}}(H)\) by

$$\begin{aligned} A^\sim := \{ C\in {{\mathcal {E}}}(H):C\sim A \}, \end{aligned}$$

and more generally, if \({\mathcal {M}} \subset {{\mathcal {E}}}(H)\), then \({\mathcal {M}}^\sim := \cap \{ A^\sim :A\in {\mathcal {M}}\}\).

The relation of order in \({{\mathcal {E}}}(H)\) is fairly well-understood. However, the relation of coexistence is very poorly understood. In the case of qubit effects (i.e. when \(\dim H=2\)) the recent papers of Busch–Schmidt [4], Stano–Reitzner–Heinosaari [30] and Yu–Liu–Li–Oh [36] provide some (rather complicated) characterisations of coexistence. Although there are no similar results in higher dimensions, it was pointed out by Wolf–Perez–Garcia–Fernandez in [35] that the question of coexistence of pairs of effects can be phrased as a so-called semidefinite program, which is a manageable numerical mathematical problem. We also mention that Heinosaari–Kiukas–Reitzner in [14] generalised the qubit coexistence characterisation to pairs of effects in arbitrary dimensions that belong to the von Neumann algebra generated by two projections.

To illustrate how poorly the relation of coexistence is understood, we note that the following very natural question has not been answered before—not even for qubit effects:

What does it mean for two effects A and B to be coexistent with exactly the same effects?

As our first main result we answer this very natural question. Namely, we will show the following theorem, where \({{\mathcal {F}}}(H)\) and \(\mathcal {SC}(H)\) denote the set off all finite-rank and scalar effects on H, respectively.

Theorem 1.1

For any effects \(A,B\in {{\mathcal {E}}}(H)\) the following are equivalent:

  1. (i)

    \(B\in \{A, A^\perp \}\) or \(A,B \in \mathcal {SC}(H)\),

  2. (ii)

    \(A^\sim = B^\sim \).

Moreover, if H is separable, then the above statements are also equivalent to

  1. (iii)

    \(A^\sim \cap {{\mathcal {F}}}(H) = B^\sim \cap {{\mathcal {F}}}(H)\).

Physically speaking, the above theorem says that the (unsharp) quantum events A and B can be measured together with exactly the same quantum events if and only if they are the same, or they are each other’s negation, or both of them are scalar effects.

1.3 Automorphisms of \({{\mathcal {E}}}(H)\) with respect to two relations

Automorphisms of mathematical structures related to quantum mechanics are important to study because they provide the right tool to understand the time-evolution of certain quantum systems (see e.g. [18, Chapters V-VII] or [29]). In case when this mathematical structure is \({{\mathcal {E}}}(H)\), we call a map \(\phi :{{\mathcal {E}}}(H)\rightarrow {{\mathcal {E}}}(H)\) a standard automorphism of the effect algebra if there exists a unitary or antiunitary operator \(U:H\rightarrow H\) that (similarly to Wigner’s theorem) implements \(\phi \), i.e. we have

$$\begin{aligned} \phi (A) = UAU^*\qquad (A\in {{\mathcal {E}}}(H)). \end{aligned}$$

Obviously, standard automorphisms are automorphisms with respect to the relations of order:

figure a

of ortho-complementation:

figure b

and also of coexistence:

figure c

One of the fundamental theorems in the mathematical foundations of quantum mechanics states that every ortho-order automorphism is a standard automorphism, which was first stated by Ludwig.

Ludwig’s Theorem

(1954, Theorem V.5.23 in [18]). Let H be a Hilbert space with \(\dim H \ge 2\). Assume that \(\phi :{{\mathcal {E}}}(H) \rightarrow {{\mathcal {E}}}(H)\) is a bijective map satisfying (\(\le \)) and (\(\perp \)). Then \(\phi \) is a standard automorphism of \({{\mathcal {E}}}(H)\). Conversely, every standard automorphism satisfies (\(\le \)) and (\(\perp \)).

We note that Ludwig’s proof was incomplete and that he formulated his theorem under the additional assumption that \(\dim H \ge 3\). The reader can find a rigorous proof of this version for instance in [5]. Let us also point out that the two-dimensional case of Ludwig’s theorem was only proved in 2001 in [22].

It is very natural to ask whether the conclusion of Ludwig’s theorem remains true, if one replaces either (\(\le \)) by (\(\sim \)), or (\(\perp \)) by (\(\sim \)). Note that in light of Theorem 1.1, in the former case the condition (\(\perp \)) becomes almost redundant, except on \(\mathcal {SC}(H)\). However, as scalar effects are exactly those that are coexistent with every effect (see Sect. 2), this problem basically reduces to the characterisation of automorphisms with respect to coexistence only—which we shall consider later on.

In 2001, Molnár answered the other question affirmatively under the assumption that \(\dim H \ge 3\).

Molnár’s Theorem (2001, Theorem 1 in [20]). Let H be a Hilbert space with \(\dim H \ge 3\). Assume that \(\phi :{{\mathcal {E}}}(H) \rightarrow {{\mathcal {E}}}(H)\) is a bijective map satisfying ( \(\le \)) and ( \(\sim \)). Then \(\phi \) is a standard automorphism of \({{\mathcal {E}}}(H)\). Conversely, every standard automorphism satisfies ( \(\le \)) and ( \(\sim \)).

In this paper we shall prove the two-dimensional version of Molnár’s theorem.

Theorem 1.2

Assume that \(\phi :{{\mathcal {E}}}({\mathbb {C}}^2) \rightarrow {{\mathcal {E}}}({\mathbb {C}}^2)\) is a bijective map satisfying (\(\le \)) and (\(\sim \)). Then \(\phi \) is a standard automorphism of \({{\mathcal {E}}}({\mathbb {C}}^2)\). Conversely, every standard automorphism satisfies (\(\le \)) and (\(\sim \)).

Note that Molnár used the fundamental theorem of projective geometry to prove the aforementioned result, therefore his proof indeed works only if \(\dim H \ge 3\). Here, as an application of Theorem 1.1, we shall give an alternative proof of Molnár’s theorem that does not use this dimensionality constraint, hence fill this dimensionality gap in. More precisely, we will reduce Molnár’s theorem and Theorem 1.2 to Ludwig’s theorem (see the end of Sect. 2).

1.4 Automorphisms of \({{\mathcal {E}}}(H)\) with respect to only one relation

It is certainly a much more difficult problem to describe the general form of automorphisms with respect to only one relation. Of course, here we mean either order preserving (\(\le \)), or coexistence preserving (\(\sim \)) maps, as it is easy (and not at all interesting) to describe bijective transformations that satisfy (\(\perp \)). It has been known for quite some time that automorphisms with respect to the order relation on \({{\mathcal {E}}}(H)\) may differ a lot from standard automorphisms, although they are at least always continuous with respect to the operator norm. We do not state the related result here, but only mention that the answer finally has been given by the second author in [26, Corollary 1.2] (see also [28]).

The other main purpose of this paper is to give the characterisation of all automorphisms of \({{\mathcal {E}}}(H)\) with respect to the relation of coexistence. As can be seen from our result below, these maps can also differ a lot from standard automorphisms, moreover, unlike in the case of (\(\le \)) they are in general not even continuous.

Theorem 1.3

Let H be a Hilbert space with \(\dim H \ge 2\), and \(\phi :{{\mathcal {E}}}(H) \rightarrow {{\mathcal {E}}}(H)\) be a bijective map that satisfies (\(\sim \)). Then there exists a unitary or antiunitary operator \(U:H\rightarrow H\) and a bijective map \(g:[0,1] \rightarrow [0,1]\) such that we have

$$\begin{aligned} \{\phi (A), \phi (A^\perp )\} = \{UAU^*, UA^\perp U^*\} \qquad (A\in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)) \end{aligned}$$


$$\begin{aligned} \phi (tI) = g(t)I \qquad (t\in [0,1]). \end{aligned}$$

Conversely, every map of the above form preserves coexistence in both directions.

Observe that in the above theorem if we assume that our automorphism is continuous with respect to the operator norm, then up to unitary-antiunitary equivalence we obtain that \(\phi \) is either the identity map, or the ortho-complementation: \(A\mapsto A^\perp \). Also note that the converse statement of the theorem follows easily by Theorem 1.1. As we mentioned earlier, the description of all automorphisms with respect to (\(\sim \)) and (\(\perp \)) now follows easily, namely, we get the same conclusion as in the above theorem, except that now g further satisfies \(g(1-t) = 1-g(t)\) for all \(0\le t\le 1\).

1.5 Quantum mechanical interpretation of automorphisms of \({{\mathcal {E}}}(H)\)

In order to explain the above automorphism theorems’ physical interpretation, let us go back first to Wigner’s theorem. Assume there are two physicists who analyse the same quantum mechanical system using the same Hilbert space H, but possibly they might associate different rank-one projections to the same quantum (pure) state. However, we know that they always agree on the transition probabilities. Then according to Wigner’s theorem, there must be either a unitary, or an antiunitary operator with which we can transform from one analysis into the other (like a ”coordinate transformation”).

For the interpretation of Ludwig’s theorem, let us say there are two physicists who analyse the same quantum fuzzy measurement, but they might associate different effects to the same quantum fuzzy event. If we at least know that both of them agree on which pairs of effects are ortho-complemented, and which effect is larger than the other (i.e. implies the occurrence of the other), then by Ludwig’s theorem there must exist either a unitary, or an antiunitary operator that gives us the way to transform from one analysis into the other.

As for the interpretation of our Theorem 1.3, if we only know that our physicists agree on which pairs of effects are coexistent (i.e. which pairs of quantum events can be measured together), then there is a map \(\phi \) satisfying (2) and (3) that transforms the first physicist’s analysis into the other’s.

1.6 The outline of the paper

In the next section we will prove our first main result, Theorem 1.1, and as an application, we prove Molnár’s theorem in an alternative way that works for qubit effects as well. This will be followed by Sect. 3 where we prove our other main result, Theorem 1.3, in the case when \(\dim H = 2\). Then in Sect. 4, using the two-dimensional case, we shall prove the general version of our result. Let us point out once more that, unless otherwise stated, H is not assumed to be separable. We will close our paper with some discussion on the qubit case and some open problems in Sects. 56.

2 Proofs of Theorems 1.11.2, and Molnár’s Theorem

We start with some definitions. The symbol \({{\mathcal {P}}}(H)\) will stand for the set of all projections (idempotent and self-adjoint operators) on H, and \({{\mathcal {P}}}_1(H)\) will denote the set of all rank-one projections. The commutant of an effect A intersected with \({{\mathcal {E}}}(H)\) will be denoted by

$$\begin{aligned} A^c := \{ C\in {{\mathcal {E}}}(H) :CA=AC\}, \end{aligned}$$

and more generally, for a subset \({\mathcal {M}}\subset {{\mathcal {E}}}(H)\) we will use the notation \({\mathcal {M}}^c := \cap \{ A^c :A\in {\mathcal {M}} \}\). Also, we set \(A^{cc} := (A^c)^c\) and \({\mathcal {M}}^{cc} := ({\mathcal {M}}^c)^c\).

We continue with three known lemmas on the structure of coexistent pairs of effects that can all be found in [27]. The first two have been proved earlier, see [4, 21].

Lemma 2.1

For any \(A\in {{\mathcal {E}}}(H)\) and \(P \in {{\mathcal {P}}}(H)\) the following statements hold:

  1. (a)

    \(A^\sim = {{\mathcal {E}}}(H)\) if and only if \(A \in \mathcal {SC}(H)\),

  2. (b)

    \(P^\sim = P^c\),

  3. (c)

    \(A^c \subseteq A^\sim \).

Lemma 2.2

Let \(A,B \in {{\mathcal {E}}}(H)\) so that their matrices are diagonal with respect to some orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}} H_i\), i.e. \(A = \oplus _{i\in \mathcal {I}} A_i\) and \(B = \oplus _{i\in \mathcal {I}} B_i \in {{\mathcal {E}}}(\oplus _{i\in \mathcal {I}} H_i)\). Then \(A\sim B\) if and only if \(A_i\sim B_i\) for all \(i\in \mathcal {I}\).

Lemma 2.3

Let \(A,B \in {{\mathcal {E}}}(H)\). Then the following are equivalent:

  1. (i)

    \(A \sim B\),

  2. (ii)

    there exist effects \(M,N \in {{\mathcal {E}}}(H)\) such that \(M \le A\), \(N \le I-A\), and \(M+N = B\).

We continue with a corollary of Lemma 2.1.

Corollary 2.4

For any effect A and projection \(P\in A^\sim \) we have \(P\in A^c\). In particular, we have \(A^\sim \cap {{\mathcal {P}}}(H) = A^c\cap {{\mathcal {P}}}(H)\).


Since coexistence is a symmetric relation, we obtain \(A\in P^\sim \), which implies \(AP=PA\). \(\square \)

The next four statements are easy consequences of Lemma 2.3, we only prove two of them.

Corollary 2.5

For any effect A we have \(A^\sim = \left( A^\perp \right) ^\sim \).

Corollary 2.6

Let \(A\in {{\mathcal {E}}}(H)\) such that either \(0\notin \sigma (A)\), or \(1\notin \sigma (A)\). Then there exists an \(\varepsilon >0\) such that \(\{C\in {{\mathcal {E}}}(H):C\le \varepsilon I\} \subseteq A^\sim \).

We recall the definition of the strength function of \(A\in {{\mathcal {E}}}(H)\):

$$\begin{aligned} \Lambda (A,P) = \max \{ \lambda \ge 0 :\lambda P \le A \} \qquad (P\in {{\mathcal {P}}}_1(H)), \end{aligned}$$

see [2] for more details and properties.

Corollary 2.7

Assume that \(A\in {{\mathcal {E}}}(H)\), \(0 < t \le 1\), and \(P\in {{\mathcal {P}}}_1(H)\). Then the following conditions are equivalent:

  1. (i)

    \(A\sim tP\);

  2. (ii)
    $$\begin{aligned} t \le \Lambda (A,P) + \Lambda (A^\perp ,P). \end{aligned}$$


By (ii) of Lemma 2.3 we have \(A\sim tP\) if and only if there exist \(t_1, t_2 \ge 0\) such that \(t = t_1 + t_2\), \(t_1 P \le A\) and \(t_2 P \le A^\perp \), which is of course equivalent to (4). \(\square \)

Corollary 2.8

Let \(A,B \in {{\mathcal {E}}}(H)\) such that \(A^\sim \subseteq B^\sim \). Assume that with respect to the orthogonal decomposition \(H = H_1 \oplus H_2\) the two effects have the following block-diagonal matrix forms:

$$\begin{aligned} A = \left[ \begin{matrix} A_1 &{} 0 \\ 0 &{} A_2 \end{matrix}\right] \qquad \text {and} \qquad B = \left[ \begin{matrix} B_1 &{} 0 \\ 0 &{} B_2 \end{matrix}\right] \in {{\mathcal {E}}}(H_1 \oplus H_2). \end{aligned}$$

Then we also have

$$\begin{aligned} A_1^\sim \subseteq B_1^\sim \qquad \text {and} \qquad A_2^\sim \subseteq B_2^\sim . \end{aligned}$$

In particular, if \(A^\sim = B^\sim \), then \(A_1^\sim = B_1^\sim \) and \(A_2^\sim = B_2^\sim \).


Let \(P_1\) be the orthogonal projection onto \(H_1\). By Lemma 2.2 we observe that

$$\begin{aligned}&\Bigg \{ \left[ \begin{matrix} C &{} 0 \\ 0 &{} D \end{matrix}\right] \in {{\mathcal {E}}}\left( H_1\oplus H_2 \right) :C \sim A_1, D \sim A_2 \Bigg \} = P_1^c \cap A^\sim \\&\qquad \subseteq P_1^c \cap B^\sim = \Bigg \{ \left[ \begin{matrix} E &{} 0 \\ 0 &{} F \end{matrix}\right] \in {{\mathcal {E}}}\left( H_1\oplus H_2 \right) :E \sim B_1, F \sim B_2 \Bigg \}, \end{aligned}$$

which immediately implies (5). \(\square \)

Next, we recall the Busch–Gudder theorem about the explicit form of the strength function, which we shall use frequently here. We also adopt their notation, so whenever it is important to emphasise that the range of a rank-one projection P is \({\mathbb {C}}\cdot x\) with some \(x\in H\) such that \(\Vert x\Vert =1\), we write \(P_x\) instead. Furthermore, the symbol \(A^{-1/2}\) denotes the algebraic inverse of the bijective restriction \(A^{1/2}|_{(\mathrm{Im\,}A)^-}:(\mathrm{Im\,}A)^-\rightarrow \mathrm{Im\,}(A^{1/2})\), where \(\cdot ^-\) stands for the closure of a set. In particular, for all \(x \in \mathrm{Im\,}(A^{1/2})\) the vector \(A^{-1/2} x\) is the unique element in \((\mathrm{Im\,}A)^-\) which \(A^{1/2}\) maps to x.

Busch–Gudder Theorem

(1999, Theorem 4 in [2]) For every effect \(A \in {{\mathcal {E}}}(H)\) and unit vector \(x\in H\) we have

$$\begin{aligned} \Lambda (A,P_x) = \left\{ \begin{matrix} \Vert A^{-1/2} x \Vert ^{-2}, &{} \text {if} \; x \in \mathrm{Im\,}(A^{1/2}), \\ 0, &{} \text {otherwise.} \end{matrix} \right. \end{aligned}$$

We proceed with proving some new results which will be crucial in the proofs of our main theorems. The first lemma is probably well-known, but as we did not find it in the literature, we state and prove it here. Recall that WOT and SOT stand for the weak- and strong operator topologies, respectively.

Lemma 2.9

For any effect \(A\in {{\mathcal {E}}}(H)\), the set \(A^\sim \) is convex and WOT-compact, hence it is also SOT- and norm-closed. Moreover, if H is separable, then the subset \(A^\sim \cap {{\mathcal {F}}}(H)\) is SOT-dense, hence also WOT-dense, in \(A^\sim \).


Let \(t\in [0,1]\) and \(B_1,B_2\in A^\sim \). By Lemma 2.3 there are \(M_1,M_2,N_1,N_2\in {{\mathcal {E}}}(H)\) such that \(M_1+N_1 = B_1\), \(M_2+N_2 = B_2\), \(M_1\le A, N_1\le I-A\) and \(M_2\le A, N_2\le I-A\). Hence setting \(M = tM_1+(1-t)M_2\in {{\mathcal {E}}}(H)\) and \(N = tN_1+(1-t)N_2\in {{\mathcal {E}}}(H)\) gives \(M+N = tB_1+(1-t)B_2\) and \(M\le A, N\le I-A\), thus \(tB_1+(1-t)B_2\sim A\), so \(A^\sim \) is indeed convex.

Next, we prove that \(A^\sim \) is WOT-compact. Clearly, \({{\mathcal {E}}}(H)\) is WOT-compact, as it is a bounded WOT-closed subset of \({{\mathcal {B}}}(H)\) (see [7, Proposition IX.5.5]), therefore it is enough to show that \(A^\sim \) is WOT-closed. Let \(\{B_\nu \}_\nu \subseteq A^\sim \) be an arbitrary net that WOT-converges to B, we shall show that \(B\sim A\) holds. For every \(\nu \) we can find two effects \(M_\nu \) and \(N_\nu \) such that \(M_\nu +N_\nu = B_\nu \), \(M_\nu \le A\) and \(N_\nu \le I-A\). By WOT-compactness of \({{\mathcal {E}}}(H)\), there exists a subnet \(\{B_\xi \}_\xi \) such that \(M_\xi \rightarrow M\) in WOT with some effect M. Again, by WOT-compactness of \({{\mathcal {E}}}(H)\), there exists a subnet \(\{B_\eta \}_\eta \) of the subnet \(\{B_\xi \}_\xi \) such that \(N_\eta \rightarrow N\) in WOT with some effect N. Obviously we also have \(B_\eta \rightarrow B\) and \(M_\eta \rightarrow M\) in WOT. Therefore we have \(M+N = B\) and by definition of WOT convergence we also obtain \(M\le A\), \(N\le I-A\), hence indeed \(B\sim A\). Closedness with respect to the other topologies is straightforward.

Concerning our last statement for separable spaces, first we point out that for every effect C there exists a net of finite rank effects \(\{C_\nu \}_\nu \) such that \(C_\nu \le C\) holds for all \(\nu \) and \(C_\nu \rightarrow C\) in SOT. Denote by \(E_C\) the projection-valued spectral measure of C, and set \(C_n = \sum _{j=0}^{n} \frac{j}{n} E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) for every \(n\in {\mathbb {N}}\). Clearly, each \(C_n\) has finite spectrum, satisfies \(C_n\le C\), and \(\Vert C_n - C\Vert \rightarrow 0\) as \(n\rightarrow \infty \). For each spectral projection \(E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) we can take a sequence of finite-rank projections \(\{P_k^{j,n}\}_{k=1}^\infty \) such that \(P_k^{j,n}\le E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) for all k and \(P_k^{j,n} \rightarrow E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) in SOT as \(k\rightarrow \infty \). Define \(C_{n,k} := \sum _{j=0}^{n} \frac{j}{n} P_k^{j,n}\). It is apparent that \(C_{n,k} \le C_n\) for all n and k, and that for each n we have \(C_{n,k} \rightarrow C_n\) in SOT as \(k\rightarrow \infty \). Therefore the SOT-closure of \(\{C_{n,k}:n,k\in {\mathbb {N}}\}\) contains each \(C_n\), hence also C, thus we can construct a net \(\{C_\nu \}_\nu \) with the required properties.

Now, let \(B\in A^\sim \) be arbitrary, and consider two other effects \(M,N\in {{\mathcal {E}}}(H)\) that satisfy the conditions of Lemma 2.3 (ii). Set \(C:= M\oplus N \in {{\mathcal {E}}}(H\oplus H)\), and denote by \(E_M\) and \(E_N\) the projection-valued spectral measures of M and N, respectively. Clearly, \(E_C\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) = E_M\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \oplus E_N\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) for each j and n. In the above construction we can choose finite-rank projections of the form \(P_k^{j,n} = Q_k^{j,n}\oplus R_k^{j,n} \in {{\mathcal {P}}}(H\oplus H)\) where \(Q_k^{j,n}, R_k^{j,n} \in {{\mathcal {P}}}(H)\), \(Q_k^{j,n}\le E_M\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) and \(R_k^{j,n}\le E_N\left( \left[ \frac{j}{n},\frac{j+1}{n}\right) \right) \) holds for all kn. Then each element \(C_\nu \) of the convergent net is an orthogonal sum of the form \(M_\nu \oplus N_\nu \in {{\mathcal {E}}}(H\oplus H))\). It is apparent that \(M_\nu , N_\nu \in {{\mathcal {F}}}(H)\), \(M_\nu \le M\) and \(N_\nu \le N\) for all \(\nu \), and that \(M_\nu \rightarrow M\), \(N_\nu \rightarrow N\) holds in SOT. Therefore \(M_\nu +N_\nu \in {{\mathcal {F}}}(H)\cap A^\sim \) and \(M_\nu +N_\nu \) converges to \(M+N = B\) in SOT, the proof is complete. \(\square \)

We proceed to investigate when do we have the equation \(A^\sim = B^\sim \) for two effects A and B, which will take several steps. We will denote the set of all rank-one effects by \({{\mathcal {F}}}_1(H) := \{tP :P\in {{\mathcal {P}}}_1(H), 0<t\le 1\}\).

Lemma 2.10

Let \(H = H_1\oplus H_2\) be an orthogonal decomposition and assume that \(A, B\in {{\mathcal {E}}}(H)\) have the following matrix decompositions:

$$\begin{aligned} A = \left[ \begin{matrix} \lambda _1 I_1 &{} 0 \\ 0 &{} \lambda _2 I_2 \end{matrix} \right] \quad \text {and} \quad B = \left[ \begin{matrix} \mu _1 I_1 &{} 0 \\ 0 &{} \mu _2 I_2 \end{matrix} \right] \; \in {{\mathcal {E}}}(H_1\oplus H_2) \end{aligned}$$

where \(\lambda _1, \lambda _2, \mu _1, \mu _2 \in [0,1]\), and \(I_1\) and \(I_2\) denote the identity operators on \(H_1\) and \(H_2\), respectively. Then the following are equivalent:

  1. (i)

    \(\Lambda (A,P) + \Lambda (A^\perp ,P) = \Lambda (B,P) + \Lambda (B^\perp ,P)\) holds for all \(P\in {{\mathcal {P}}}_1(H)\),

  2. (ii)

    \(A^\sim \cap {{\mathcal {F}}}_1(H) = B^\sim \cap {{\mathcal {F}}}_1(H)\),

  3. (iii)

    either \(\lambda _1=\lambda _2\) and \(\mu _1=\mu _2\), or \(\lambda _1=\mu _1\) and \(\lambda _2=\mu _2\), or \(\lambda _1 + \mu _1 = \lambda _2 + \mu _2 =1\).


The directions (iii)\(\Longrightarrow \)(ii)\(\iff \)(i) are trivial by Lemma 2.1 (a) and Corollaries 2.5, 2.7, so we shall only consider the direction (i)\(\Longrightarrow \)(iii). First, a straightforward calculation using the Busch–Gudder theorem gives the following for every \(x_1\in H_1, x_2\in H_2, \Vert x_1\Vert = \Vert x_2\Vert = 1\) and \(0\le \alpha \le \tfrac{\pi }{2}\):

$$\begin{aligned} \Lambda \left( A, P_{\cos \alpha x_1 + \sin \alpha x_2} \right) = \frac{1}{ \left( \tfrac{1}{\lambda _1}\right) \cdot \cos ^2\alpha + \left( \tfrac{1}{\lambda _2}\right) \cdot \sin ^2\alpha }, \end{aligned}$$

where we use the interpretations \(\tfrac{1}{0} = \infty \), \(\tfrac{1}{\infty } = 0\), \(\infty \cdot 0 = 0\), \(\infty + \infty = \infty \), and \(\infty + a = \infty \), \(\infty \cdot a = \infty \) (\(a>0\)), in order to make the formula valid also for the case when \(\lambda _1 = 0\) or \(\lambda _2 = 0\). Clearly, (8) depends only on \(\alpha \), but not on the specific choices of \(x_1\) and \(x_2\). We define the following two functions

$$\begin{aligned} T_A:\left[ 0,\tfrac{\pi }{2}\right] \rightarrow [0,1], \qquad T_A(\alpha ) = \Lambda \left( A, P_{\cos \alpha x_1 + \sin \alpha x_2} \right) + \Lambda \left( A^\perp , P_{\cos \alpha x_1 + \sin \alpha x_2} \right) \end{aligned}$$


$$\begin{aligned} T_B:\left[ 0,\tfrac{\pi }{2}\right] \rightarrow [0,1], \qquad T_B(\alpha ) = \Lambda \left( B, P_{\cos \alpha x_1 + \sin \alpha x_2} \right) + \Lambda \left( B^\perp , P_{\cos \alpha x_1 + \sin \alpha x_2} \right) , \end{aligned}$$

which are the same by our assumptions. By (8), for all \(0\le \alpha \le \tfrac{\pi }{2}\) we have

$$\begin{aligned} T_A(\alpha )&= \frac{1}{ \left( \tfrac{1}{\lambda _1}\right) \cdot \cos ^2\alpha + \left( \tfrac{1}{\lambda _2}\right) \cdot \sin ^2\alpha } + \frac{1}{ \left( \tfrac{1}{1-\lambda _1}\right) \cdot \cos ^2\alpha + \left( \tfrac{1}{1-\lambda _2}\right) \cdot \sin ^2\alpha } \nonumber \\&= \frac{1}{ \left( \tfrac{1}{\mu _1}\right) \cdot \cos ^2\alpha + \left( \tfrac{1}{\mu _2}\right) \cdot \sin ^2\alpha } + \frac{1}{ \left( \tfrac{1}{1-\mu _1}\right) \cdot \cos ^2\alpha + \left( \tfrac{1}{1-\mu _2}\right) \cdot \sin ^2\alpha } =T_B(\alpha ). \end{aligned}$$

Next, we observe the following implications:

  • if \(\lambda _1 = \lambda _2\), then \(T_A(\alpha )\) is the constant 1 function,

  • if \(\lambda _1 = 0\) and \(\lambda _2 = 1\), then \(T_A(\alpha )\) is the characteristic function \(\chi _{\{0,\pi /2\}}(\alpha )\),

  • if \(\lambda _1 = 0\) and \(0< \lambda _2 < 1\), then \(T_A(\alpha )\) is continuous on \(\left[ 0,\tfrac{\pi }{2}\right) \), but has a jump at \(\tfrac{\pi }{2}\), namely \(\lim _{\alpha \rightarrow \tfrac{\pi }{2}-} T_A(\alpha ) = 1-\lambda _2\) and \(T_A(\tfrac{\pi }{2}) = 1\),

  • if \(\lambda _1 = 1\) and \(0< \lambda _2 < 1\), then \(T_A(\alpha )\) is continuous on \(\left[ 0,\tfrac{\pi }{2}\right) \), but has a jump at \(\tfrac{\pi }{2}\), namely \(\lim _{\alpha \rightarrow \tfrac{\pi }{2}-} T_A(\alpha ) = \lambda _2\) and \(T_A(\tfrac{\pi }{2}) = 1\),

  • if \(\lambda _1,\lambda _2 \in (0,1)\), then \(T_A(\alpha )\) is continuous on \(\left[ 0,\tfrac{\pi }{2}\right] \),

  • if \(\lambda _1 \ne \lambda _2\), then we have \(T_A(0) = T_A(\tfrac{\pi }{2}) = 1\) and \(T_A(\alpha ) < 1\) for all \(0< \alpha < \tfrac{\pi }{2}\).

All of the above statements are rather straightforward computations using the formula (9), let us only show the last one here. Clearly, \(T_A(0) = T_A(\tfrac{\pi }{2}) = 1\) is obvious. As for the other assertion, if \(\lambda _1,\lambda _2 \in (0,1)\), then we can use the strict version of the weighted harmonic-arithmetic mean inequality:

$$\begin{aligned}&\frac{1}{ \tfrac{1}{\lambda _1} \cos ^2\alpha + \tfrac{1}{\lambda _2} \sin ^2\alpha } + \frac{1}{ \tfrac{1}{1-\lambda _1} \cos ^2\alpha + \tfrac{1}{1-\lambda _2} \sin ^2\alpha } \\&\quad< (\lambda _1 \cos ^2\alpha + \lambda _2 \sin ^2\alpha ) + ((1-\lambda _1) \cos ^2\alpha + (1-\lambda _2) \sin ^2\alpha ) = 1 \quad (0<\alpha <\tfrac{\pi }{2}). \end{aligned}$$

If \(\lambda _1 = 0< \lambda _2 < 1\), then we calculate in the following way:

$$\begin{aligned}&\frac{1}{ \left( \tfrac{1}{0}\right) \cos ^2\alpha + \tfrac{1}{\lambda _2} \sin ^2\alpha } + \frac{1}{ \cos ^2\alpha + \tfrac{1}{1-\lambda _2} \sin ^2\alpha }\\&\quad = \frac{1}{ 1-\sin ^2\alpha + \tfrac{1}{1-\lambda _2} \sin ^2\alpha }< 1 \quad (0<\alpha <\tfrac{\pi }{2}). \end{aligned}$$

The remaining cases are very similar.

The above observations together with Corollary 2.5 and (9) readily imply the following:

  • \(A\in \mathcal {SC}(H)\) if and only if \(B\in \mathcal {SC}(H)\),

  • \(A\in {{\mathcal {P}}}(H)\setminus \mathcal {SC}(H)\) if and only if \(B\in {{\mathcal {P}}}(H)\setminus \mathcal {SC}(H)\), in which case \(B \in \{A, A^\perp \}\),

  • there exists a \(P\in {{\mathcal {P}}}(H)\setminus \mathcal {SC}(H)\) and a \(t\in (0,1)\) with \(A\in \{tP, I-tP\}\) if and only if \(B\in \{tP, I-tP\}\),

  • \(\lambda _1,\lambda _2 \in (0,1)\) and \(\lambda _1\ne \lambda _2\) if and only if \(\mu _1, \mu _2 \in (0,1)\) and \(\mu _1\ne \mu _2\).

So what remained is to show that in the last case we further have \(B \in \{A, A^\perp \}\), which is what we shall do below.

Let us introduce the following functions:

$$\begin{aligned}&{\mathcal {T}}_A:[0,1]\rightarrow [0,1], \qquad {\mathcal {T}}_A(s) := T_A(\arcsin \sqrt{s}) = \frac{\lambda _1\lambda _2}{\lambda _1 s + \lambda _2 (1-s)} \\&\quad + \frac{(1-\lambda _1)(1-\lambda _2)}{(1-\lambda _1) s + (1-\lambda _2) (1-s)} \end{aligned}$$


$$\begin{aligned}&{\mathcal {T}}_B:[0,1]\rightarrow [0,1], \qquad {\mathcal {T}}_B(s) := T_B(\arcsin \sqrt{s}) = \frac{\mu _1\mu _2}{\mu _1 s + \mu _2 (1-s)}\\&\quad + \frac{(1-\mu _1)(1-\mu _2)}{(1-\mu _1) s + (1-\mu _2) (1-s)}. \end{aligned}$$

Our aim is to prove that \({\mathcal {T}}_A(s) = {\mathcal {T}}_B(s)\) (\(s\in [0,1]\)) implies either \(\lambda _1=\mu _1\) and \(\lambda _2=\mu _2\), or \(\lambda _1 + \mu _1 = \lambda _2 + \mu _2 =1\). The derivative of \({\mathcal {T}}_A\) is

$$\begin{aligned} \tfrac{d {\mathcal {T}}_A}{ds}(s) = (\lambda _1-\lambda _2)\left( \frac{-\lambda _1\lambda _2}{(\lambda _1 s + \lambda _2 (1-s))^2} + \frac{(1-\lambda _1)(1-\lambda _2)}{((1-\lambda _1) s + (1-\lambda _2) (1-s))^2}\right) , \end{aligned}$$

from which we calculate

$$\begin{aligned} \tfrac{d {\mathcal {T}}_A}{ds}(0) = -\frac{(\lambda _1-\lambda _2)^2}{(1-\lambda _2)\lambda _2} \quad \text {and} \quad \tfrac{d {\mathcal {T}}_A}{ds}(1) = \frac{(\lambda _1-\lambda _2)^2}{(1-\lambda _1)\lambda _1}. \end{aligned}$$

Therefore, if we managed to show that the function

$$\begin{aligned} F:(0,1)^2 \rightarrow {\mathbb {R}}^{2}, \quad F(x,y) = \left( \tfrac{(x-y)^2}{(1-x)x}, \tfrac{(x-y)^2}{(1-y)y} \right) \end{aligned}$$

is injective on the set \(\Delta := \{(x,y)\in {\mathbb {R}}^{2}:0<y<x<1 \}\), then we are done (note that \(F(x,y) = F(1-x,1-y)\)). For this assume that with some \(c,d>0\) we have

$$\begin{aligned} \frac{(x-y)^2}{(1-x)x} = \frac{1}{c} \quad \text {and} \quad \frac{(x-y)^2}{(1-y)y} = \frac{1}{d}, \end{aligned}$$

or equivalently,

$$\begin{aligned} (1-x)x = c(x-y)^2 \quad \text {and} \quad (1-y)y = d(x-y)^2. \end{aligned}$$

If we substitute \(u = \tfrac{x-y}{2}\) and \(v = \tfrac{x+y}{2}\), then we get

$$\begin{aligned} (u+v)^2-(u+v) = -4cu^2 \quad \text {and} \quad (v-u)^2-(v-u) = -4du^2. \end{aligned}$$

Now, considering the sum and difference of these two equations and manipulate them a bit gives

$$\begin{aligned} v^2-v = -(2c+2d+1)u^2 \quad \text {and} \quad v = (d-c) u + \tfrac{1}{2}. \end{aligned}$$

From these latter equations we conclude

$$\begin{aligned} x-y = 2u = \sqrt{\tfrac{1}{(d-c)^2+2c+2d+1}} \quad \text {and} \quad x+y = 2v = 2(d-c) u + 1, \end{aligned}$$

which clearly implies that F is globally injective on \(\Delta \), and the proof is complete. \(\square \)

We have an interesting consequence in finite dimensions.

Corollary 2.11

Assume that \(2\le \dim H < \infty \) and \(A,B \in {{\mathcal {E}}}(H)\). Then the following are equivalent:

  1. (i)

    \(\Lambda (A, P) + \Lambda (A^\perp , P) = \Lambda (B, P) + \Lambda (B^\perp , P)\) for all \(P\in P_1(H)\),

  2. (ii)

    \(A^\sim \cap {{\mathcal {F}}}_1(H) = B^\sim \cap {{\mathcal {F}}}_1(H)\),

  3. (iii)

    either \(A,B \in \mathcal {SC}(H)\), or \(A=B\), or \(A=B^\perp \).


The directions (i)\(\iff \)(ii)\(\Longleftarrow \)(iii) are trivial, so we shall only prove the (ii)\(\Longrightarrow \)(iii) direction. First, let us consider the two-dimensional case. As we saw in the proof of Lemma 2.10, we have \(A^\sim \cap {{\mathcal {F}}}_1(H) = {{\mathcal {F}}}_1(H)\) if and only if A is a scalar effect (see the first set of bullet points there). Therefore, without loss of generality we may assume that none of A and B are scalar effects. Notice that by Lemma 2.1, A and B commute with exactly the same rank-one projections, hence A and B possess the forms in (7) with some one-dimensional subspaces \(H_1\) and \(H_2\), and an easy application of Lemma 2.10 gives (iii).

As for the general case, since again A and B commute with exactly the same rank-one projections, we can jointly diagonalise them with respect to some orthonormal basis \(\{e_j\}_{j=1}^n\), where \(n = \dim H\):

$$\begin{aligned} A = \left[ \begin{matrix} \lambda _1 &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} \lambda _2 &{} \dots &{} 0 &{} 0 \\ \vdots &{} &{} \ddots &{} &{} \vdots \\ 0 &{} 0 &{} \dots &{} \lambda _{n-1} &{} 0 \\ 0 &{} 0 &{} \dots &{} 0 &{} \lambda _n \\ \end{matrix}\right] \quad \text {and} \quad B = \left[ \begin{matrix} \mu _1 &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} \mu _2 &{} \dots &{} 0 &{} 0 \\ \vdots &{} &{} \ddots &{} &{} \vdots \\ 0 &{} 0 &{} \dots &{} \mu _{n-1} &{} 0 \\ 0 &{} 0 &{} \dots &{} 0 &{} \mu _n \\ \end{matrix}\right] . \end{aligned}$$

Of course, for any two distinct \(i, j \in \{1,\dots , n\}\) we have the following equation for the strength functions:

$$\begin{aligned} \Lambda (A, P) + \Lambda (A^\perp , P) = \Lambda (B, P) + \Lambda (B^\perp , P) \qquad (P \in P_1({\mathbb {C}}\cdot e_i + {\mathbb {C}}\cdot e_j)), \end{aligned}$$

which instantly implies

$$\begin{aligned} \left[ \begin{matrix} \mu _i &{} 0 \\ 0 &{} \mu _j \end{matrix}\right] ^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}\cdot e_i + {\mathbb {C}}\cdot e_j) = \left[ \begin{matrix} \lambda _i &{} 0 \\ 0 &{} \lambda _j \end{matrix}\right] ^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}\cdot e_i + {\mathbb {C}}\cdot e_j). \end{aligned}$$

By the two-dimensional case this means that we have one of the following cases:

  • \(\lambda _i = \lambda _j\) and \(\mu _i = \mu _j\),

  • \(\lambda _i \ne \lambda _j\) and either \(\mu _i = \lambda _i\) and \(\mu _j = \lambda _j\), or \(\mu _i = 1-\lambda _i\) and \(\mu _j = 1-\lambda _j\).

From here it is easy to conclude (iii). \(\square \)

The commutant of an operator \(T\in {{\mathcal {B}}}(H)\) will be denoted by \(T' := \{ S\in {{\mathcal {B}}}(H) :ST=TS \}\), and more generally, if \({\mathcal {M}}\subseteq {{\mathcal {B}}}(H)\), then we set \({\mathcal {M}}' := \cap \{ T' :T\in {\mathcal {M}}\}\). We shall use the notations \(T'' := (T')'\) and \({\mathcal {M}}'' := ({\mathcal {M}}')'\) for the double commutants.

Lemma 2.12

For any \(A,B \in {{\mathcal {E}}}(H)\) the following three assertions hold:

  1. (a)

    If \(A^\sim \subseteq B^\sim \), then \(B\in A''\).

  2. (b)

    If \(\dim H \le \aleph _0\) and \(A^\sim \subseteq B^\sim \), then there exists a Borel function \(f:[0,1] \rightarrow [0,1]\) such that \(B = f(A)\).

  3. (c)

    If B is a convex combination of \(A, A^\perp , 0\) and I, then \(A^\sim \subseteq B^\sim \).


(a): Assume that \(C\in A'\). Our goal is to show \(B\in C'\). We express C in the following way:

$$\begin{aligned} C = C_{\mathfrak {R}} + i C_{\mathfrak {I}}, \; C_{\mathfrak {R}} = \frac{C+C^*}{2}, \; C_{\mathfrak {I}} = \frac{C-C^*}{2i} \end{aligned}$$

where \(C_{\mathfrak {R}}\) and \(C_{\mathfrak {I}}\) are self-adjoint (they are usually called the real and imaginary parts of C). Since A is self-adjoint, \(C^* \in A'\), hence \(C_{\mathfrak {R}}, C_{\mathfrak {I}} \in A'\). Let \(E_{\mathfrak {R}}\) and \(E_{\mathfrak {I}}\) denote the projection-valued spectral measures of \(C_{\mathfrak {R}}\) and \(C_{\mathfrak {I}}\), respectively. By the spectral theorem ([7, Theorem IX.2.2]), Lemma 2.1 and Corollary 2.4, we have \(E_{\mathfrak {R}}(\Delta ), E_{\mathfrak {I}}(\Delta ) \in A^c \subseteq A^\sim \subseteq B^\sim \), therefore also \(E_{\mathfrak {R}}(\Delta ), E_{\mathfrak {I}}(\Delta ) \in B'\) for all \(\Delta \in {{\mathcal {B}}}_{\mathbb {R}}\), which gives \(C\in B'\).

(b): This is an easy consequence of [7, Proposition IX.8.1 and Lemma IX.8.7].

(c): If \(A\sim C\), then also \(A^\perp , 0\) and \(I\sim C\). Hence by the convexity of \(C^\sim \) we obtain \(B\sim C\). \(\square \)

Now, we are in the position to prove our first main result.

Proof of Theorem 1.1

If H is separable, then the equivalence (ii)\(\iff \)(iii) is straightforward by Lemma 2.9. For general H the direction (i)\(\Longrightarrow \)(ii) is obvious, therefore we shall only prove (ii)\(\Longrightarrow \)(i), first in the separable, and then in the general case. By Lemma 2.1, we may assume throughout the rest of the proof that A and B are non-scalar effects. We will denote the spectral subspace of a self-adjoint operator T associated to a Borel set \(\Delta \subseteq {\mathbb {R}}\) by \(H_T(\Delta )\).

(ii)\(\Longrightarrow \)(i) in the separable case: We split this part into two steps.

STEP 1: Here, we establish two estimations, (11) and (12), for the strength functions of A and B on certain subspaces of H. Let \(\lambda _1, \lambda _2 \in \sigma (A), \lambda _1 \ne \lambda _2\) and \(0< \varepsilon < \tfrac{1}{2}|\lambda _1-\lambda _2|\). Then the spectral subspaces \(H_1 = H_A\left( (\lambda _1-\varepsilon , \lambda _1+\varepsilon ) \right) \) and \(H_2 = H_A\left( (\lambda _2-\varepsilon , \lambda _2+\varepsilon ) \right) \) are non-trivial and orthogonal. Set \(H_3\) to be the orthogonal complement of \(H_1\oplus H_2\), then the matrix of A written in the orthogonal decomposition \(H = H_1\oplus H_2 \oplus H_3\) is diagonal:

$$\begin{aligned} A = \left[ \begin{matrix} A_1 &{} 0 &{} 0 \\ 0 &{} A_2 &{} 0 \\ 0 &{} 0 &{} A_3 \end{matrix}\right] \in {{\mathcal {B}}}(H_1\oplus H_2 \oplus H_3). \end{aligned}$$

Note that \(H_3\) might be a trivial subspace. Since by Corollary 2.4A and B commute with exactly the same projections, the matrix of B in \(H = H_1\oplus H_2 \oplus H_3\) is also diagonal:

$$\begin{aligned} B = \left[ \begin{matrix} B_1 &{} 0 &{} 0 \\ 0 &{} B_2 &{} 0 \\ 0 &{} 0 &{} B_3 \end{matrix}\right] \in {{\mathcal {B}}}(H_1\oplus H_2 \oplus H_3). \end{aligned}$$

At this point, let us emphasise that of course \(H_j, A_j\) and \(B_j\) (\(j=1,2,3\)) all depend on \(\lambda _1, \lambda _2\) and \(\varepsilon \), but in order to keep our notation as simple as possible, we will stick with these symbols. However, if at any point it becomes important to point out this dependence, we shall use for instance \(B_j^{(\lambda _1, \lambda _2, \varepsilon )}\) instead of \(B_j\). Similar conventions apply later on.

Observe that by Corollary 2.8 we have

$$\begin{aligned} \left[ \begin{matrix} A_1 &{} 0 \\ 0 &{} A_2 \end{matrix}\right] ^\sim = \left[ \begin{matrix} B_1 &{} 0 \\ 0 &{} B_2 \end{matrix}\right] ^\sim . \end{aligned}$$

Now, we pick two arbitrary points \(\mu _1\in \sigma (B_1)\) and \(\mu _2\in \sigma (B_2)\). Then obviously, the following two subspaces are non-zero subspaces of \(H_1\) and \(H_2\), respectively:

$$\begin{aligned} {\widehat{H}}_1 := (H_1)_{B_1}\big ( (\mu _1-\varepsilon , \mu _1+\varepsilon ) \big ), \;\; {\widehat{H}}_2 := (H_2)_{B_2}\big ( (\mu _2-\varepsilon , \mu _2+\varepsilon ) \big ). \end{aligned}$$

Similarly as above, we have the following matrix forms where \({\check{H}}_j = H_j \ominus {\widehat{H}}_j\) \((j=1,2)\):

$$\begin{aligned} B_1 = \left[ \begin{matrix} {\widehat{B}}_1 &{} 0 \\ 0 &{} {\check{B}}_1 \end{matrix}\right] \in {{\mathcal {B}}}({\widehat{H}}_1 \oplus {\check{H}}_1) \;\; \text {and} \;\; B_2 = \left[ \begin{matrix} {\widehat{B}}_2 &{} 0 \\ 0 &{} {\check{B}}_2 \end{matrix}\right] \in {{\mathcal {B}}}({\widehat{H}}_2 \oplus {\check{H}}_2) \end{aligned}$$


$$\begin{aligned} A_1 = \left[ \begin{matrix} {\widehat{A}}_1 &{} 0 \\ 0 &{} {\check{A}}_1 \end{matrix}\right] \in {{\mathcal {B}}}({\widehat{H}}_1 \oplus {\check{H}}_1) \;\; \text {and} \;\; A_2 = \left[ \begin{matrix} {\widehat{A}}_2 &{} 0 \\ 0 &{} {\check{A}}_2 \end{matrix}\right] \in {{\mathcal {B}}}({\widehat{H}}_2 \oplus {\check{H}}_2). \end{aligned}$$

Note that \({\check{H}}_1\) or \({\check{H}}_2\) might be trivial subspaces. Again by Corollary 2.8, we have

$$\begin{aligned} \left[ \begin{matrix} {\widehat{A}}_1 &{} 0 \\ 0 &{} {\widehat{A}}_2 \end{matrix}\right] ^\sim = \left[ \begin{matrix} {\widehat{B}}_1 &{} 0 \\ 0 &{} {\widehat{B}}_2 \end{matrix}\right] ^\sim . \end{aligned}$$

Let us point out that by construction \(\sigma ({\widehat{A}}_j) \subseteq [\lambda _j-\varepsilon ,\lambda _j+\varepsilon ]\) and \(\sigma ({\widehat{B}}_j) \subseteq [\mu _j-\varepsilon ,\mu _j+\varepsilon ]\). Corollary 2.7 gives the following identity for the strength functions, where \({\widehat{I}}_j\) denotes the identity on \({\widehat{H}}_j\) \((j=1,2)\):

$$\begin{aligned}&\Lambda \left( \left[ \begin{matrix} {\widehat{A}}_1 &{} 0 \\ 0 &{} {\widehat{A}}_2 \end{matrix}\right] , P \right) + \Lambda \left( \left[ \begin{matrix} {\widehat{I}}_1 - {\widehat{A}}_1 &{} 0 \\ 0 &{} {\widehat{I}}_2 - {\widehat{A}}_2 \end{matrix}\right] , P \right) \nonumber \\&\quad = \Lambda \left( \left[ \begin{matrix} {\widehat{B}}_1 &{} 0 \\ 0 &{} {\widehat{B}}_2 \end{matrix}\right] , P \right) + \Lambda \left( \left[ \begin{matrix} {\widehat{I}}_1 - {\widehat{B}}_1 &{} 0 \\ 0 &{} {\widehat{I}}_2 - {\widehat{B}}_2 \end{matrix}\right] , P \right) \quad \left( \forall \; P\in {{\mathcal {P}}}_1\left( {\widehat{H}}_1\oplus {\widehat{H}}_2\right) \right) . \end{aligned}$$


$$\begin{aligned} \Theta :{\mathbb {R}}\rightarrow [0,1], \quad \Theta (t) = \left\{ \begin{matrix} 0 &{} \text {if} \; t< 0 \\ t &{} \text {if} \; 0 \le t \le 1 \\ 1&{} \text {if} \; 1 < t \end{matrix} \right. , \end{aligned}$$

and notice that we have the following two estimations for all rank-one projections P:

$$\begin{aligned}&\Lambda \left( \left[ \begin{matrix} \Theta (\lambda _1-\varepsilon ){\widehat{I}}_1 &{} 0 \\ 0 &{} \Theta (\lambda _2-\varepsilon ){\widehat{I}}_2 \end{matrix}\right] , P \right) \nonumber \\&\qquad + \Lambda \left( \left[ \begin{matrix} \Theta (1-\lambda _1-\varepsilon ){\widehat{I}}_1 &{} 0 \\ 0 &{} \Theta (1-\lambda _2-\varepsilon ){\widehat{I}}_2 \end{matrix}\right] , P \right) \nonumber \\&\quad \le \text {the expression in }(10) \nonumber \\&\quad \le \Lambda \left( \left[ \begin{matrix} \Theta (\lambda _1+\varepsilon ){\widehat{I}}_1 &{} 0 \\ 0 &{} \Theta (\lambda _2+\varepsilon ){\widehat{I}}_2 \end{matrix}\right] , P \right) \nonumber \\&\qquad + \Lambda \left( \left[ \begin{matrix} \Theta (1-\lambda _1+\varepsilon ){\widehat{I}}_1 &{} 0 \\ 0 &{} \Theta (1-\lambda _2+\varepsilon ){\widehat{I}}_2 \end{matrix}\right] , P \right) \end{aligned}$$


$$\begin{aligned}&\Lambda \left( \left[ \begin{matrix} \Theta (\mu _1-\varepsilon ){\widehat{I}}_1 &{} 0 \\ 0 &{} \Theta (\mu _2-\varepsilon ){\widehat{I}}_2 \end{matrix}\right] , P \right) \nonumber \\&\qquad + \Lambda \left( \left[ \begin{matrix} \Theta (1-\mu _1-\varepsilon ){\widehat{I}}_1 &{} 0 \\ 0 &{} \Theta (1-\mu _2-\varepsilon ){\widehat{I}}_2 \end{matrix}\right] , P \right) \nonumber \\&\quad \le \text {the expression in} (10) \nonumber \\&\quad \le \Lambda \left( \left[ \begin{matrix} \Theta (\mu _1+\varepsilon ){\widehat{I}}_1 &{} 0 \\ 0 &{} \Theta (\mu _2+\varepsilon ){\widehat{I}}_2 \end{matrix}\right] , P \right) \nonumber \\&\qquad + \Lambda \left( \left[ \begin{matrix} \Theta (1-\mu _1+\varepsilon ){\widehat{I}}_1 &{} 0 \\ 0 &{} \Theta (1-\mu _2+\varepsilon ){\widehat{I}}_2 \end{matrix}\right] , P \right) . \end{aligned}$$

Note that the above estimations hold for any arbitrarily small \(\varepsilon \) and for all suitable choices of \(\mu _1\) and \(\mu _2\) (which of course depend on \(\varepsilon \)).

STEP 2: Here we show that \(B\in \{A,A^\perp \}\). Let us define the following set that depends only on \(\lambda _j\):

$$\begin{aligned} {\mathcal {C}}_j= & {} {\mathcal {C}}_j^{(\lambda _j)} := \bigcap \left\{ \sigma \left( B_j^{(\lambda _1, \lambda _2, \varepsilon )} \right) :0< \varepsilon< \tfrac{1}{2}|\lambda _1 - \lambda _2| \right\} \\= & {} \bigcap \left\{ \sigma \left( B|_{H_A((\lambda _j-\varepsilon ,\lambda _j+\varepsilon ))} \right) :0 < \varepsilon \right\} \quad (j=1,2). \end{aligned}$$

Notice that as this set is an intersection of monotonically decreasing (as \(\varepsilon \searrow 0\)), compact, non-empty sets, it must contain at least one element. Also, observe that if \(\mu _1\in {\mathcal {C}}_1\) and \(\mu _2\in {\mathcal {C}}_2\), then (11) and (12) hold for all \(\varepsilon >0\).

We proceed with proving that either \({\mathcal {C}}_1 = \{\lambda _1\}\) and \({\mathcal {C}}_2 = \{\lambda _2\}\), or \({\mathcal {C}}_1 = \{1-\lambda _1\}\) and \({\mathcal {C}}_2 = \{1-\lambda _2\}\) hold. Fix two arbitrary elements \(\mu _1 \in {\mathcal {C}}_1\) and \(\mu _2 \in {\mathcal {C}}_2\), and assume that neither \(\lambda _1 = \mu _1\) and \(\lambda _2 = \mu _2\), nor \(\lambda _1 + \mu _1 = \lambda _2 + \mu _2 = 1\) hold. From here our aim is to get a contradiction. As we showed in the proof of Lemma 2.10, there exists an \(\alpha _0 \in \left( 0,\tfrac{\pi }{2}\right) \) such that we have

$$\begin{aligned}&\frac{1}{ \left( \tfrac{1}{\lambda _1}\right) \cdot \cos ^2\alpha _0 + \left( \tfrac{1}{\lambda _2}\right) \cdot \sin ^2\alpha _0 } + \frac{1}{ \left( \tfrac{1}{1-\lambda _1}\right) \cdot \cos ^2\alpha _0 + \left( \tfrac{1}{1-\lambda _2}\right) \cdot \sin ^2\alpha _0 } \\&\quad \ne \frac{1}{ \left( \tfrac{1}{\mu _1}\right) \cdot \cos ^2\alpha _0 + \left( \tfrac{1}{\mu _2}\right) \cdot \sin ^2\alpha _0 } + \frac{1}{ \left( \tfrac{1}{1-\mu _1}\right) \cdot \cos ^2\alpha _0 + \left( \tfrac{1}{1-\mu _2}\right) \cdot \sin ^2\alpha _0 } \end{aligned}$$

where we interpret both sides as in (8). Notice that both summands on both sides depend continuously on \(\lambda _1,\lambda _2,\mu _1\) and \(\mu _2\). Therefore there exists an \(\varepsilon >0\) small enough and a rank-one projection \(P = P_{\cos \alpha _0 {\widehat{x}}_1+\sin \alpha _0 {\widehat{x}}_2}\), with \({\widehat{x}}_1\in {\widehat{H}}_1, {\widehat{x}}_2\in {\widehat{H}}_2, \Vert {\widehat{x}}_1\Vert =\Vert {\widehat{x}}_2\Vert =1\), such that the closed intervals bounded by the right- and left-hand sides of (11), and those of (12) are disjoint—which is a contradiction.

Observe that as we can do the above for any two disjoint elements of the spectrum \(\sigma (A)\), we can conclude that one of the following possibilities occur:

$$\begin{aligned} \{\lambda \} = \bigcap \left\{ \sigma \left( B|_{H_A((\lambda -\varepsilon , \lambda +\varepsilon ))} \right) :\varepsilon > 0 \right\} \qquad (\lambda \in \sigma (A)) \end{aligned}$$


$$\begin{aligned} \{1-\lambda \} = \bigcap \left\{ \sigma \left( B|_{H_A((\lambda -\varepsilon , \lambda +\varepsilon ))} \right) :\varepsilon > 0 \right\} \qquad (\lambda \in \sigma (A)). \end{aligned}$$

From here, we show that (13) implies \(A=B\), and (14) implies \(B=A^\perp \). As the latter can be reduced to the case (13), by considering \(B^\perp \) instead of B, we may assume without loss of generality that (13) holds. By Lemma 2.12 and [7, Theorem IX.8.10], there exists a function \(f\in L^\infty (\mu )\), where \(\mu \) is a scalar-valued spectral measure of A, such that \(B = f(A)\). Moreover, we have \(B = A\) if and only if \(f(\lambda ) = \lambda \) \(\mu \)-a.e, so we only have to prove the latter equation. Let us fix an arbitrarily small number \(\delta > 0\). By the spectral mapping theorem ( [7, Theorem IX.8.11]) and (13) we notice that for every \(\lambda \in \sigma (A)\) there exists an \(0< \varepsilon _{\lambda } < \delta \) such that

$$\begin{aligned} \mu -\mathrm {essran} \left( f|_{(\lambda -\varepsilon _{\lambda },\lambda +\varepsilon _{\lambda })} \right) = \sigma \left( B|_{H_A((\lambda -\varepsilon _{\lambda },\lambda +\varepsilon _{\lambda }))} \right) \subseteq (\lambda -\delta ,\lambda +\delta ), \end{aligned}$$

where \(\mu -\mathrm {essran}\) denotes the essential range of a function with respect to \(\mu \) (see [7, Example IX.2.6]). Now, for every \(\lambda \in \sigma (A)\) we fix such an \(\varepsilon _{\lambda }\). Clearly, the intervals \(\{(\lambda -\varepsilon _{\lambda },\lambda +\varepsilon _{\lambda }) :\lambda \in \sigma (A)\}\) cover the whole spectrum \(\sigma (A)\), which is a compact set. Therefore we can find finitely many of them, let’s say \(\lambda _1, \dots , \lambda _n\) so that

$$\begin{aligned} \sigma (A) \subseteq \bigcup _{j=1}^{n} \left( \lambda _j-\varepsilon _{\lambda _j},\lambda _j+\varepsilon _{\lambda _j} \right) . \end{aligned}$$

Finally, we define the function

$$\begin{aligned} h(\lambda ) = \lambda _j, \text { where } |\lambda -\lambda _{i}| \ge \varepsilon _{\lambda _i} \text { for all } 1\le i< j \text { and } |\lambda -\lambda _{j}| < \varepsilon _{\lambda _j}. \end{aligned}$$

By definition we have \(\Vert h - \mathrm {id}_{\sigma (A)} \Vert _{\infty } \le \delta \) where the \(\infty \)-norm is taken with respect to \(\mu \) and \(\mathrm {id}_{\sigma (A)}(\lambda ) = \lambda \) \((\lambda \in \sigma (A))\). But notice that by (15) we also have \(\Vert h - f \Vert _{\infty } \le \delta \), and hence \(\Vert f - \mathrm {id}_{\sigma (A)} \Vert _\infty \le 2\delta \). As this inequality holds for all positive \(\delta \), we actually get that \(f(\lambda ) = \lambda \) for \(\mu \)-a.e. \(\lambda \).

(ii)\(\Longrightarrow \)(i) in the non-separable case: It is well-known that there exists an orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}} H_i\) such that each \(H_i\) is a non-trivial, separable, invariant subspace of A, see for instance [7, Proposition IX.4.4]. Since A and B commute with exactly the same projections, both are diagonal with respect to the decomposition \(H = \oplus _{i\in \mathcal {I}} H_i\):

$$\begin{aligned} A = \oplus _{i\in \mathcal {I}} A_i \;\; \text {and} \;\; B = \oplus _{i\in \mathcal {I}} B_i \in {{\mathcal {E}}}(\oplus _{i\in \mathcal {I}} H_i). \end{aligned}$$

By Corollary 2.8 we have \(A_i^\sim = B_i^\sim \) for all \(i\in \mathcal {I}\), therefore the separable case implies

$$\begin{aligned} \text {either} \; A_i = B_i, \;\; \text {or} \; B_i = A_i^\perp , \;\; \text {or} \; A_i, B_i \in \mathcal {SC}(H_i) \qquad (i\in \mathcal {I}). \end{aligned}$$

Without loss of generality we may assume from now on that there exists an \(i_0\in \mathcal {I}\) so that \(A_{i_0}\) is not a scalar effect. (In case all of them are scalar, we simply combine two subspaces \(H_{i_1}\) and \(H_{i_2}\) so that \(\sigma (A_{i_1})\ne \sigma (A_{i_2})\)). This implies either \(A_{i_0} = B_{i_0}\), or \(B_{i_0} = A_{i_0}^\perp \). By considering \(B^\perp \) instead of B if necessary, we may assume from now on that \(A_{i_0} = B_{i_0}\) holds.

Finally, let \(i_1 \in \mathcal {I} \setminus \{i_0\}\) be arbitrary, and let us consider the orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}\setminus \{i_0,i_1\}} H_i \oplus K\) where \(K = H_{i_0}\oplus H_{i_1}\). Similarly as above, we obtain either \(A_{i_0}\oplus A_{i_1} = B_{i_0} \oplus B_{i_1}\), or \(B_{i_0}\oplus B_{i_1} = A_{i_0}^\perp \oplus A_{i_1}^\perp \), but since \(A_{i_0} = B_{i_0}\), we must have \(A_{i_1} = B_{i_1}\). As this holds for arbitrary \(i_1\), the proof is complete. \(\square \)

Now, we are in the position to give an alternative proof of Molnár’s theorem which also extends to the two-dimensional case.

Proof of Theorem 1.2 and Molnár’s theorem

By (a) of Lemma 2.1 and (\(\sim \)) we obtain \(\phi (\mathcal {SC}(H)) = \mathcal {SC}(H)\), moreover, the property (\(\le \)) implies the existence of a strictly increasing bijection \(g:[0,1] \rightarrow [0,1]\) such that \(\phi (\lambda I) = g(\lambda ) I\) for every \(\lambda \in [0,1]\). By Theorem 1.1 we conclude

$$\begin{aligned} \phi (A^\perp ) = \phi (A)^\perp \qquad (A\in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)). \end{aligned}$$

We only have to show that the same holds for scalar operators, because then the theorem is reduced to Ludwig’s theorem. For any effect A and any set of effects \({{\mathcal {S}}}\) let us define the following sets \(A^\le := \{B\in {{\mathcal {E}}}(H):A\le B\}\), \(A^\ge := \{B\in {{\mathcal {E}}}(H):A\ge B\}\) and \({{\mathcal {S}}}^\perp := \{B^\perp :B\in {{\mathcal {S}}}\}\). Observe that for any \(s,t\in [0,1]\) we have

$$\begin{aligned} \left( (sI)^\le \cap (tI)^\ge \setminus \mathcal {SC}(H)\right) ^\perp = (sI)^\le \cap (tI)^\ge \setminus \mathcal {SC}(H) \ne \emptyset \end{aligned}$$

if and only if \(t=1-s\) and \(s< \tfrac{1}{2}\). Thus for all \(s < \tfrac{1}{2}\) we obtain

$$\begin{aligned} \emptyset&\ne \left( (g(s)I)^\le \cap (g(1-s)I)^\ge \setminus \mathcal {SC}(H)\right) ^\perp = \phi \left( \left( (sI)^\le \cap ((1-s)I)^\ge \setminus \mathcal {SC}(H)\right) ^\perp \right) \\&= \phi \left( (sI)^\le \cap ((1-s)I)^\ge \setminus \mathcal {SC}(H)\right) = (g(s)I)^\le \cap (g(1-s)I)^\ge \setminus \mathcal {SC}(H), \end{aligned}$$

which by (16) implies \(g(1-s) = 1-g(s)\) and \(g(s)< \tfrac{1}{2}\), therefore we indeed have (\(\perp \)) for every effect. \(\square \)

3 Proof of Theorem 1.3 in Two Dimensions

In this section we prove our other main theorem for qubit effects. In order to do that we need to prove a few preparatory lemmas. We start with a characterisation of rank-one projections in terms of coexistence.

Lemma 3.1

For any \(A\in {{\mathcal {E}}}({\mathbb {C}}^2)\) the following are equivalent:

  1. (i)

    there are no effects \(B\in {{\mathcal {E}}}({\mathbb {C}}^2)\) such that \(B^\sim \subsetneq A^\sim \),

  2. (ii)

    \(A \in {{\mathcal {P}}}_1({\mathbb {C}}^2)\).


The case when \(A\in \mathcal {SC}({\mathbb {C}}^2)\) is trivial, therefore we may assume otherwise throughout the proof.

(i)\(\Longrightarrow \)(ii): Suppose that \(A \notin {{\mathcal {P}}}_1({\mathbb {C}}^2)\), then by Corollary 2.6 there exists an \(\varepsilon >0\) such that \(\{C\in {{\mathcal {E}}}({\mathbb {C}}^2) :C \le \varepsilon I\} \subseteq A^\sim \). Let \(B\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\cap A^c\), then we have \(B^\sim = B^c = A^c\subseteq A^\sim \). But it is very easy to find a \(C\in {{\mathcal {E}}}({\mathbb {C}}^2)\) such that \(C \le \varepsilon I\) and \(C\notin B^c\), therefore we conclude \(B^\sim \subsetneq A^\sim \).

(ii)\(\Longrightarrow \)(i): If \(A \in {{\mathcal {P}}}_1({\mathbb {C}}^2)\), \(B\in {{\mathcal {E}}}({\mathbb {C}}^2)\) and \(B^\sim \subsetneq A^\sim \), then also \(B^c\subsetneq A^c\), which is impossible. \(\square \)

Note that the above statement does not hold in higher dimensions, see the final section of this paper for more details. We continue with a characterisation of rank-one and ortho-rank-one qubit effects in terms of coexistence.

Lemma 3.2

Let \(A\in {{\mathcal {E}}}({\mathbb {C}}^2)\setminus \mathcal {SC}({\mathbb {C}}^2)\). Then the following are equivalent:

  1. (i)

    A or \(A^\perp \in {{\mathcal {F}}}_1({\mathbb {C}}^2)\setminus {{\mathcal {P}}}_1({\mathbb {C}}^2)\),

  2. (ii)

    There exists at least one \(B\in {{\mathcal {E}}}({\mathbb {C}}^2)\) such that \(B^\sim \subsetneq A^\sim \), and for every such pair of effects \(B_1, B_2\) we have either \(B_1^\sim \subseteq B_2^\sim \), or \(B_2^\sim \subseteq B_1^\sim \).

Moreover, if (i) holds, i.e. A or \(A^\perp = tP\) with \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) and \(0<t<1\), then we have \(B^\sim \subseteq A^\sim \) if and only if B or \(B^\perp = sP\) with some \(t\le s\le 1\).


First, notice that by Theorem 1.1 and Lemma 2.12 (c) we have

$$\begin{aligned} (sP)^\sim \subseteq (tP)^\sim \quad \iff \quad t\le s \qquad (P\in {{\mathcal {P}}}_1({\mathbb {C}}^2),\; t, s \in (0,1]). \end{aligned}$$

(i)\(\Longrightarrow \)(ii): If we have \(B^\sim \subseteq (tP)^\sim \) with some rank-one projection P, \(t \in (0,1]\) and qubit effect B, then by Lemma 2.12 (b) we obtain \(P\in B^c\) and \(B\notin \mathcal {SC}({\mathbb {C}}^2)\). Furthermore, since \(B^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2) \subseteq (tP)^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\), by Corollary 2.7 we obtain

$$\begin{aligned} T_B(\alpha ) \le T_{tP}(\alpha ) \qquad (0\le \alpha \le \tfrac{\pi }{2}), \end{aligned}$$

where we use the notation from the proof of Lemma 2.10. Thus, the discontinuity of \(T_{tP}(\alpha )\) at either \(\alpha = 0\), or \(\alpha = \tfrac{\pi }{2}\), implies the discontinuity of \(T_B(\alpha )\) at the same \(\alpha \). Whence we conclude either \(B=sP\), or \(B=I-sP\) with some \(t\le s\le 1\).

(ii)\(\Longrightarrow \)(i): By Lemma 3.1, (ii) cannot hold for elements of \({{\mathcal {P}}}_1({\mathbb {C}}^2)\), so we only have to check that if \(A, A^\perp \notin {{\mathcal {F}}}_1({\mathbb {C}}^2) \cup \mathcal {SC}({\mathbb {C}}^2)\), then (ii) fails. Suppose that the spectral decomposition of A is \(\lambda _1 P + \lambda _2 P^\perp \) where \(1> \lambda _1>\lambda _2 > 0\). Then by Lemma 2.12 (c) we find that \(\left( \lambda _1 P\right) ^\sim \subseteq A^\sim \) and \(\left( (1-\lambda _2) P^\perp \right) ^\sim \subseteq A^\sim \) (see Figure 1), but by the previous part neither \((\lambda _1 P)^\sim \subseteq \left( (1-\lambda _2) P^\perp \right) ^\sim \), nor \(\left( (1-\lambda _2) P^\perp \right) ^\sim \subseteq (\lambda _1 P)^\sim \) holds. \(\square \)

Fig. 1
figure 1

The figure shows all effects commuting with \(A \in {{\mathcal {E}}}({\mathbb {C}}^2)\setminus \mathcal {SC}({\mathbb {C}}^2)\), whose spectral decomposition is \(A = \lambda _1 P + \lambda _2 P^\perp \) with \(1>\lambda _1>\lambda _2>0\)

For a visualisation of \((tP)^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\) see Sect. 5. Before we proceed with the proof of Theorem 1.3 for qubit effects, we need a few more lemmas about rank-one projections acting on \({\mathbb {C}}^2\).

Lemma 3.3

For all \(P, Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) we have

$$\begin{aligned} \Vert P-Q\Vert ^2 = -\det (P-Q) = 1-\mathrm{tr}PQ = 1 - \Vert P^\perp -Q\Vert ^2. \end{aligned}$$


Since \(\mathrm{tr}(P-Q) = 0\), the eigenvalues of the self-adjoint operator \(P-Q\) are \(\lambda \) and \(-\lambda \) with some \(\lambda \ge 0\). Hence we have \(\Vert P-Q\Vert ^2 = -\det (P-Q)\). Applying a unitary similarity if necessary, we may assume without loss of generality that \((1,0) \in \mathrm{Im\,}P\). Obviously, there exist \(0\le \vartheta \le \tfrac{\pi }{2}\) and \(0\le \mu \le 2\pi \) such that \((\cos \vartheta , e^{i\mu }\sin \vartheta ) \in \mathrm{Im\,}Q\). Thus the matrix forms of P and Q in the standard basis are

$$\begin{aligned} P = P_{(1,0)} = \left[ \begin{matrix} 1 \\ 0 \end{matrix} \right] \cdot \left[ \begin{matrix} 1 \\ 0 \end{matrix} \right] ^* = \left[ \begin{matrix} 1 &{} 0 \\ 0 &{} 0 \end{matrix} \right] \end{aligned}$$


$$\begin{aligned} Q = P_{(\cos \vartheta , e^{i\mu }\sin \vartheta )} = \left[ \begin{matrix} \cos \vartheta \\ e^{i\mu }\sin \vartheta \end{matrix} \right] \cdot \left[ \begin{matrix} \cos \vartheta \\ e^{i\mu }\sin \vartheta \end{matrix} \right] ^* = \left[ \begin{matrix} \cos ^2\vartheta &{} e^{-i\mu }\cos \vartheta \sin \vartheta \\ e^{i\mu }\cos \vartheta \sin \vartheta &{} \sin ^2\vartheta \end{matrix} \right] , \end{aligned}$$

where we used the notation of the Busch–Gudder theorem. Now, an easy calculation gives us \(\det (P-Q) = -\sin ^2\vartheta \) and \(\mathrm{tr}PQ = \cos ^2\vartheta \). Hence the second equation in (17) is proved, and the third one follows from \(\mathrm{tr}P^\perp Q = 1 - \mathrm{tr}PQ\). \(\square \)

For \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) and \(s\in [0,1]\), let us use the following notation:

$$\begin{aligned} {\mathcal {M}}_{P,s} := \left\{ Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2) :\Vert P-Q\Vert = s \right\} . \end{aligned}$$

Next, we examine this set.

Lemma 3.4

For all \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) the following statements are equivalent:

  1. (i)

    \(s = \sin \tfrac{\pi }{4}\),

  2. (ii)

    there exists an \(R\in {\mathcal {M}}_{P,s}\) such that \(R^\perp \in {\mathcal {M}}_{P,s}\),

  3. (iii)

    for all \(R\in {\mathcal {M}}_{P,s}\) we have also \(R^\perp \in {\mathcal {M}}_{P,s}\).


One could use the Bloch representation (see Sect. 5), however, let us give here a purely linear algebraic proof. Note that for any \(R_1, R_2 \in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) we have \(\Vert R_1-R_2\Vert = 1\) if and only if \(R_2 = R_1^\perp \). Without loss of generality we may assume that P has the matrix form of (18). Then for any \(0\le \vartheta \le \tfrac{\pi }{2}\) and \(R_1, R_2 \in {\mathcal {M}}_{P,\sin \vartheta }\) we have

$$\begin{aligned} R_1= & {} \left[ \begin{matrix} \cos ^2\vartheta &{} e^{-i\mu _1}\cos \vartheta \sin \vartheta \\ e^{i\mu _1}\cos \vartheta \sin \vartheta &{} \sin ^2\vartheta \end{matrix} \right] \quad \text {and}\\ R_2= & {} \left[ \begin{matrix} \cos ^2\vartheta &{} e^{-i\mu _2}\cos \vartheta \sin \vartheta \\ e^{i\mu _2}\cos \vartheta \sin \vartheta &{} \sin ^2\vartheta \end{matrix} \right] \end{aligned}$$

with some \(\mu _1,\mu _2\in {\mathbb {R}}\). Hence, we get

$$\begin{aligned} \Vert R_1-R_2\Vert&= \sqrt{1-\mathrm{tr}R_1R_2} = \sqrt{\sin ^2\vartheta \cos ^2\vartheta (2-e^{i(\mu _1-\mu _2)}-e^{i(\mu _2-\mu _1)})} \\&= |e^{i\mu _1}-e^{i\mu _2}| \cos \vartheta \sin \vartheta = \tfrac{1}{2} |e^{i\mu _1}-e^{i\mu _2}| \sin (2 \vartheta ). \end{aligned}$$

Notice that the right-hand side is always less than or equal to 1. Moreover, for any \(\mu _1\in {\mathbb {R}}\) there exist a \(\mu _2\in {\mathbb {R}}\) such that \(\Vert R_1-R_2\Vert =1\) if and only if \(\vartheta = \tfrac{\pi }{4}\). This completes the proof. \(\square \)

Lemma 3.5

Let \(P,Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) and \(s,t\in (0,1)\). Then the following are equivalent:

  1. (i)

    \(tP\sim sQ\)

  2. (ii)

    either \(Q=P\), or \(Q=P^\perp \), or

    $$\begin{aligned} s \le \frac{1}{\tfrac{1}{1-t}\Vert P^\perp -Q\Vert ^2+\Vert P-Q\Vert ^2}. \end{aligned}$$


The case when \(Q\in \{P,P^\perp \}\) is trivial, so from now on we assume otherwise. Recall that two rank-one effects with different images are coexistent if and only if their sum is an effect, see [20, Lemma 2]. Therefore, (i) is equivalent to \(I-tP-sQ \ge 0\). Since \(\mathrm{tr}(I-tP-sQ) = 2-t-s > 0\), the latter is further equivalent to \(\det (I-tP-sQ) \ge 0\). Without loss of generality we may assume that P and Q have the matrix forms written in (18) and (19) with \(0<\vartheta <\tfrac{\pi }{2}\). Then a calculation gives

$$\begin{aligned} \det (I-tP-sQ) = s(t-1)\sin ^2\vartheta - s\cos ^2\vartheta +1-t = 1-t-s+ts\Vert P-Q\Vert ^2. \end{aligned}$$

From the latter we get that \(\det (I-tP-sQ) \ge 0\) holds if and only if

$$\begin{aligned} s\le \frac{1-t}{1-t\Vert P-Q\Vert ^2}, \end{aligned}$$

which, by (17) is equivalent to (ii). \(\square \)

Note that we have

$$\begin{aligned} 0< \frac{1}{\tfrac{1}{1-t}\Vert P^\perp -Q\Vert ^2+\Vert P-Q\Vert ^2} < 1 \qquad (t\in (0,1), P,Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2), Q\notin \{P,P^\perp \}). \end{aligned}$$

We need one more lemma.

Lemma 3.6

Let \(P,Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\). Then there exists a projection \(R\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) such that

$$\begin{aligned} \Vert P-R\Vert = \Vert Q-R\Vert = \sin \tfrac{\pi }{4}. \end{aligned}$$


Again, one could use the Bloch representation, however, let us give here a purely linear algebraic proof. We may assume without loss of generality that P and Q are of the form (18) and (19). Then for any \(z\in {\mathbb {C}}\), \(|z|=1\) the rank-one projection

$$\begin{aligned} R = \frac{1}{\sqrt{2}}\left[ \begin{matrix} 1 \\ z \end{matrix} \right] \cdot \left( \frac{1}{\sqrt{2}}\left[ \begin{matrix} 1\\ z \end{matrix} \right] \right) ^* = \frac{1}{2}\left[ \begin{matrix} 1 &{} {\overline{z}} \\ z &{} 1 \end{matrix} \right] \end{aligned}$$

satisfies \(\Vert P-R\Vert = \sin \tfrac{\pi }{4}\). In order to complete the proof we only have to find a z with \(|z|=1\) such that \(\mathrm{tr}RQ = \tfrac{1}{2}\), which is an easy calculation. Namely, we find that \(z=ie^{i\mu }\) is a suitable choice. \(\square \)

Now, we are in the position to prove our second main result in the low-dimensional case.

Proof of Theorem 1.3 in two dimensions

The proof is divided into the following three steps:

  1. 1

    we show some basic properties of \(\phi \), in particular, that it preserves commutativity in both directions,

  2. 2

    we show that \(\phi \) maps pairs of rank-one projections with distance \(\sin \tfrac{\pi }{4}\) into pairs of rank-one projections with the same distance,

  3. 3

    we finish the proof by examining how \(\phi \) acts on rank-one projections and rank-one effects.

STEP 1: First of all, the properties of \(\phi \) imply

$$\begin{aligned} \phi (A)^\sim = \phi (A^\sim ) \quad (A\in {{\mathcal {E}}}({\mathbb {C}}^2)), \end{aligned}$$


$$\begin{aligned} B^\sim \subseteq A^\sim \;\;\iff \;\; \phi (B)^\sim \subseteq \phi (A)^\sim \quad (A,B\in {{\mathcal {E}}}({\mathbb {C}}^2)). \end{aligned}$$

Hence, it is straightforward from Lemma 2.1 that there exists a bijection \(g:[0,1] \rightarrow [0,1]\) such that

$$\begin{aligned} \phi (t I) = g(t)I \quad (t\in [0,1]). \end{aligned}$$

Also, by Lemma 3.1 we easily infer

$$\begin{aligned} \phi ({{\mathcal {P}}}_1({\mathbb {C}}^2)) = {{\mathcal {P}}}_1({\mathbb {C}}^2), \end{aligned}$$

thus, in particular, we get

$$\begin{aligned} \phi (P^c) = \phi (P^\sim ) = \phi (P)^\sim = \phi (P)^c \quad (P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

By Theorem 1.1 we also obtain

$$\begin{aligned} \phi (A^\perp ) = \phi (A)^\perp \quad (A\in {{\mathcal {E}}}({\mathbb {C}}^2)\setminus \mathcal {SC}({\mathbb {C}}^2)). \end{aligned}$$

Now, we observe that \(\phi \) preserves commutativity in both directions. Indeed we have the following for every \(A,B \in {{\mathcal {E}}}({\mathbb {C}}^2)\setminus \mathcal {SC}({\mathbb {C}}^2)\):

$$\begin{aligned} AB =&BA \,\iff \, A^\sim \cap {{\mathcal {P}}}_1({\mathbb {C}}^2) = B^\sim \cap {{\mathcal {P}}}_1({\mathbb {C}}^2) = \{P,P^\perp \} \text { for some } P\in {{\mathcal {P}}}_1({\mathbb {C}}^2) \\&\,\iff \, \phi (A)^\sim \cap {{\mathcal {P}}}_1({\mathbb {C}}^2) = \phi (B)^\sim \cap {{\mathcal {P}}}_1({\mathbb {C}}^2)\\ =&\{Q,Q^\perp \} \text { for some } Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2) \\&\,\iff \, \phi (A)\phi (B) = \phi (B)\phi (A). \end{aligned}$$

Note that we easily get the same conclusion using (20) if any of the two effects is a scalar effect.

Next, notice that Lemma 3.2 implies

$$\begin{aligned} A \,\text {or}\, A^\perp \in {{\mathcal {F}}}_1({\mathbb {C}}^2)\setminus {{\mathcal {P}}}_1({\mathbb {C}}^2) \;\;\iff \;\; \phi (A) \,\text {or}\, \phi (A)^\perp \in {{\mathcal {F}}}_1({\mathbb {C}}^2)\setminus {{\mathcal {P}}}_1({\mathbb {C}}^2). \end{aligned}$$

Therefore, by interchanging the \(\phi \)-images of tP and \(I-tP\) for some \(0<t<1\) and \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\), we may assume without loss of generality that

$$\begin{aligned} \phi \left( {{\mathcal {F}}}_1({\mathbb {C}}^2)\setminus {{\mathcal {P}}}_1({\mathbb {C}}^2)\right) = {{\mathcal {F}}}_1({\mathbb {C}}^2)\setminus {{\mathcal {P}}}_1({\mathbb {C}}^2). \end{aligned}$$

Hence we obtain the following for all rank-one projections P:

$$\begin{aligned} \phi \left( \{tP, tP^\perp :0<t\le 1\}\right)= & {} \phi \left( P^c\cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\right) = \phi (P)^c\cap {{\mathcal {F}}}_1({\mathbb {C}}^2) \\= & {} \{t\phi (P), t\phi (P)^\perp :0<t\le 1\}. \end{aligned}$$

Thus, again by interchanging the \(\phi \)-images of P and \(P^\perp \) for some \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\), and using Lemma 3.2, we may assume without loss of generality that for every \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) there exists a strictly increasing bijective map \(f_P:(0,1]\rightarrow (0,1]\) such that

$$\begin{aligned} \phi (tP) = f_P(t) \phi (P) \quad (0<t\le 1, P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

STEP 2: We define the following set for any qubit effect of the form tP, \(0<t<1, P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\):

$$\begin{aligned} \ell _{tP} := \left\{ \frac{1}{ \tfrac{1}{1-t} \Vert P^\perp -Q\Vert ^2 + \Vert P-Q\Vert ^2 } Q :Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\setminus \{P, P^\perp \} \right\} . \end{aligned}$$

(For a visualisation of \(\ell _{tP}\) see Sect. 5.) Using Lemma 3.5 we see that

$$\begin{aligned} \ell _{tP} = \big ((tP)^\sim \setminus \cup \{(sP)^\sim :t<s<1\}\big ) \cap {{\mathcal {F}}}_1({\mathbb {C}}^2) \qquad (0<t<1, P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

By the properties of \(\phi \) we obtain

$$\begin{aligned} \phi (\ell _{tP}) = \ell _{\phi (tP)} = \ell _{f_P(t)\phi (P)} \qquad (0<t<1, P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

Next, using the set introduced in (22), we prove the following property of \(\phi \):

$$\begin{aligned} \Vert P-Q\Vert = \sin \tfrac{\pi }{4} \;\iff \; \Vert \phi (P)-\phi (Q)\Vert = \sin \tfrac{\pi }{4} \qquad (P,Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

By a straightforward calculation we get that

$$\begin{aligned} \ell _{tP}\cap \ell _{rP^\perp }= & {} \left\{ \frac{1-t}{1-t \cdot s(t,r)^2}Q :Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2), \Vert P-Q\Vert = s(t,r) \right\} \\&(t,r\in (0,1), P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)) \end{aligned}$$


$$\begin{aligned} s(t,r) := \sqrt{\frac{\tfrac{t}{1-t}}{\tfrac{t}{1-t}+\tfrac{r}{1-r}}}. \end{aligned}$$

Note that \(s(t,r) = \sin \tfrac{\pi }{4}\) holds if and only if \(t= r\). By Lemma 3.4, this is further equivalent to the following:

$$\begin{aligned} \forall \; A_1 \in \ell _{tP}\cap \ell _{rP^\perp }, \; \exists \, A_2 \in \ell _{tP}\cap \ell _{rP^\perp }, A_1\ne A_2:(A_1)^\sim \cap {{\mathcal {P}}}_1({\mathbb {C}}^2) = (A_2)^\sim \cap {{\mathcal {P}}}_1({\mathbb {C}}^2). \end{aligned}$$

Notice that by (23) this is equivalent to the following:

$$\begin{aligned}&\forall \; B_1 \in \ell _{f_P(t)\phi (P)}\cap \ell _{f_{P^\perp }(r)\phi (P)^\perp }, \; \exists \, B_2 \in \ell _{f_P(t)\phi (P)}\cap \ell _{f_{P^\perp }(r)\phi (P)^\perp }, B_1\ne B_2:\\&\quad (B_1)^\sim \cap {{\mathcal {P}}}_1({\mathbb {C}}^2) = (B_2)^\sim \cap {{\mathcal {P}}}_1({\mathbb {C}}^2), \end{aligned}$$

which is further equivalent to \(f_P(t) = f_{P^\perp }(r)\).

Hence we can conclude a few important properties of \(\phi \). First, we have

$$\begin{aligned} f_P(t) = f_{P^\perp }(t) \quad (0< t \le 1, P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

Second, since for every \(0<t<1\) and \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) we have

$$\begin{aligned}&\left\{ f_Q\left( \tfrac{1-t}{1-t/2}\right) \phi (Q) :Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2), \Vert P-Q\Vert = \sin \tfrac{\pi }{4} \right\} = \phi \left( \ell _{tP}\cap \ell _{tP^\perp }\right) \\&\quad = \ell _{f_P(t)\phi (P)}\cap \ell _{f_P(t)\phi (P)^\perp } = \left\{ \tfrac{1-f_P(t)}{1-f_P(t)/2}R :R\in {{\mathcal {P}}}_1({\mathbb {C}}^2), \Vert \phi (P)-R\Vert = \sin \tfrac{\pi }{4} \right\} , \end{aligned}$$

therefore using (21) gives (24).

Furthermore, we also obtain

$$\begin{aligned} f_Q\left( \tfrac{1-t}{1-t/2}\right) = \tfrac{1-f_P(t)}{1-f_P(t)/2} \qquad \left( 0< t < 1, P,Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2), \Vert P-Q\Vert = \sin \tfrac{\pi }{4}\right) . \end{aligned}$$

By Lemma 3.6, for all \(Q_1, Q_2\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) there exists a rank-one projection P such that

$$\begin{aligned} \Vert Q_1-P\Vert = \Vert Q_2-P\Vert = \sin \tfrac{\pi }{4}. \end{aligned}$$

Therefore, applying (25) and noticing that \(t\mapsto \tfrac{1-t}{1-t/2}\) is a strictly decreasing bijection of (0, 1) gives that

$$\begin{aligned} f_{Q_1}(t) = f_{Q_2}(t) \quad (t\in (0,1), Q_1, Q_2\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

Thus we conclude that there exists a strictly increasing bijection \(f:(0,1]\rightarrow (0,1]\) such that

$$\begin{aligned} \phi (tP) = f(t) \phi (P) \quad (0<t\le 1, P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

We also observe that (25) implies

$$\begin{aligned} f\left( \tfrac{1-t}{1-t/2} \right) = \tfrac{1-f(t)}{1-f(t)/2}, \end{aligned}$$

therefore we notice that

$$\begin{aligned} f\left( 2-\sqrt{2} \right) = 2-\sqrt{2}, \end{aligned}$$

which is a consequence of the fact that the unique solution of the equation \(t = \tfrac{1-t}{1-t/2}\), \(0<t<1\), is \(t = 2-\sqrt{2}\).

STEP 3: Next, applying [12, Theorem 2.3] gives that there exists a unitary or antiunitary operator \(U:{\mathbb {C}}^2\rightarrow {\mathbb {C}}^2\) such that we have

$$\begin{aligned} U^*\phi (P)U \in \{P,P^\perp \} \quad (P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

Since either both \(U^*\phi (\cdot )U\) and \(\phi (\cdot )\) satisfy our assumptions simultaneously, or none of them does, therefore without loss of generality we may assume that we have

$$\begin{aligned} \phi (P) \in \{P,P^\perp \} \quad (P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

We now claim that

$$\begin{aligned} \text {either}\; \phi (P) = P \;\; (P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)), \; \text {or} \; \phi (P) = P^\perp \;\; (P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)). \end{aligned}$$

Let us assume otherwise, then there exist two rank-one projections P and Q such that \(\Vert P-Q\Vert < \sin \tfrac{\pi }{4}\), \(\phi (P) = P\) and \(\phi (Q) = Q^\perp \). Note that \(\Vert P-Q^\perp \Vert = \sqrt{1-\Vert P-Q\Vert ^2}> \sin \tfrac{\pi }{4} > \Vert P-Q\Vert \). By (23) and (28) we have

$$\begin{aligned}&\left\{ \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-R\Vert ^2} R:R\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\setminus \{P,P^\perp \} \right\} = \ell _{(2-\sqrt{2})P} = \phi \left( \ell _{(2-\sqrt{2})P}\right) \nonumber \\&\quad = \left\{ f\left( \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-R\Vert ^2}\right) \phi (R):R\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\setminus \{P,P^\perp \} \right\} . \end{aligned}$$

Therefore putting first \(R=Q\) and then \(R=Q^\perp \) gives

$$\begin{aligned} \phi \left( \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-Q\Vert ^2} Q \right)&= f\left( \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-Q\Vert ^2}\right) \phi (Q) \\&= f\left( \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-Q\Vert ^2}\right) Q^\perp = \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-Q^\perp \Vert ^2} Q^\perp \end{aligned}$$


$$\begin{aligned} \phi \left( \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-Q^\perp \Vert ^2} Q^\perp \right)&= f\left( \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-Q^\perp \Vert ^2}\right) \phi (Q^\perp ) \\&= f\left( \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-Q^\perp \Vert ^2}\right) Q = \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2})\Vert P-Q\Vert ^2} Q. \end{aligned}$$

But this implies that f interchanges two different numbers which contradicts to its strict increasingness—proving our claim (29).

Note that for every \(0\le \vartheta \le \tfrac{\pi }{2}\) and \(0\le \mu < 2\pi \) we have

$$\begin{aligned} (P_{(\cos \vartheta , e^{i\mu } \sin \vartheta )})^\perp= & {} \left[ \begin{matrix} \sin ^2\vartheta &{} -e^{-i\mu } \cos \vartheta \sin \vartheta \\ -e^{i\mu } \cos \vartheta \sin \vartheta &{} \cos ^2\vartheta \\ \end{matrix}\right] \\= & {} \left[ \begin{matrix} 0 &{} 1 \\ -1 &{} 0 \\ \end{matrix}\right] (P_{(\cos \vartheta , e^{i\mu } \sin \vartheta )})^t \left[ \begin{matrix} 0 &{} 1 \\ -1 &{} 0 \\ \end{matrix}\right] ^* \end{aligned}$$

where \(\cdot ^t\) stands for the transposition, and we used the notation of the Busch–Gudder theorem. It is well-known, and can be verified by an easy computation, that we have \(A^t = KAK^*\) for every qubit effect A, where K is the coordinate-wise conjugation antiunitary operator: \(K(z_1,z_2) = (\overline{z_1},\overline{z_2})\) \((z_1,z_2\in {\mathbb {C}})\). Therefore from now on we may assume without loss of generality that we have

$$\begin{aligned} \phi (P) = P \quad (P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)), \end{aligned}$$

i.e. \(\phi \) fixes all rank-one projections.

Finally, observe that (30) and (31) implies

$$\begin{aligned} f\left( \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2}) \tau }\right) = \tfrac{\sqrt{2}-1}{1-(2-\sqrt{2}) \tau } \quad (0<\tau <1), \end{aligned}$$

thus we obtain \(\phi (tP) = tP\) for all \(P\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\) and \(\sqrt{2}-1< t < 1\). But this further implies

$$\begin{aligned}&\left\{ \tfrac{1-t}{1-t\Vert P-Q\Vert ^2} Q:Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\setminus \{P,P^\perp \} \right\} = \ell _{tP} = \phi \left( \ell _{tP}\right) \\&\quad = \left\{ f\left( \tfrac{1-t}{1-t\Vert P-Q\Vert ^2}\right) Q:Q\in {{\mathcal {P}}}_1({\mathbb {C}}^2)\setminus \{P,P^\perp \} \right\} \end{aligned}$$

for all \(\sqrt{2}-1< t < 1\), from which we conclude

$$\begin{aligned} \phi (tP) = tP \quad (0<t<1), \end{aligned}$$

i.e. \(\phi \) fixes all rank-one effects. From here we only need to apply Corollary 2.11 and transform back to our original \(\phi \) to complete the proof. \(\square \)

4 Proof of Theorem 1.3 in the General Case

Here we prove the general case of our main theorem, utilising the above proved low-dimensional case. We start with two lemmas.

Lemma 4.1

Let \(P \in {{\mathcal {P}}}(H)\setminus \mathcal {SC}(H)\) and \(A \in {{\mathcal {E}}}(H) \setminus \{ P, P^\perp \}\). Then there exists a rank-one effect \(R\in {{\mathcal {F}}}_1(H)\) such that \(R \sim A\) but \(R \not \sim P\).


Assume that \(A \in {{\mathcal {E}}}(H)\) such that \(A^\sim \cap {{\mathcal {F}}}_1(H) \subseteq P^\sim = P^c\) holds. We have to show that then either \(A=P\), or \(A=P^\perp \). Clearly, A is not a scalar effect. By Corollary 2.7 we obtain that

$$\begin{aligned} \Lambda (A,Q) + \Lambda (A^\perp ,Q) \le \Lambda (P,Q) + \Lambda (P^\perp ,Q) \qquad (Q\in {{\mathcal {P}}}_1(H)). \end{aligned}$$

Notice that the set

$$\begin{aligned} \mathrm{supp}\left( \Lambda (P,\cdot ) + \Lambda (P^\perp ,\cdot )\right) := \left\{ Q\in {{\mathcal {P}}}_1(H) :\Lambda (P,Q) + \Lambda (P^\perp ,Q) > 0 \right\} \end{aligned}$$

has two connected components (with respect to the operator norm topology), namely

$$\begin{aligned} \left\{ Q\in {{\mathcal {P}}}_1(H) :\mathrm{Im\,}Q \subset \mathrm{Im\,}P \right\} \;\; \text {and} \;\; \left\{ Q\in {{\mathcal {P}}}_1(H) :\mathrm{Im\,}Q \subset \mathrm{Ker}P \right\} . \end{aligned}$$

However, by the Busch–Gudder theorem we obtain that

$$\begin{aligned}&\left\{ Q\in {{\mathcal {P}}}_1(H) :\mathrm{Im\,}Q \subset \mathrm{Im\,}A \cup \mathrm{Im\,}(I-A) \right\} \\&\quad \subseteq \left\{ Q\in {{\mathcal {P}}}_1(H) :\mathrm{Im\,}Q \subset \mathrm{Im\,}A^{1/2} \cup \mathrm{Im\,}(I-A)^{1/2} \right\} \\&\quad \subseteq \mathrm{supp}\left( \Lambda (A,\cdot ) + \Lambda (A^\perp ,\cdot )\right) \subseteq \mathrm{supp}\left( \Lambda (P,\cdot ) + \Lambda (P^\perp ,\cdot )\right) . \end{aligned}$$

Since \(\mathrm{supp}\left( \Lambda (P,\cdot ) + \Lambda (P^\perp ,\cdot )\right) \) is a closed set, we obtain

$$\begin{aligned}&\left\{ Q\in {{\mathcal {P}}}_1(H) :\mathrm{Im\,}Q \subset (\mathrm{Im\,}A)^- \cup (\mathrm{Im\,}(I-A))^- \right\} \nonumber \\&\quad = \left\{ Q\in {{\mathcal {P}}}_1(H) :\mathrm{Im\,}Q \subset (\mathrm{Ker}A)^\perp \cup (\mathrm{Ker}(I-A))^\perp \right\} \nonumber \\&\quad \subseteq \mathrm{supp}\left( \Lambda (P,\cdot ) + \Lambda (P^\perp ,\cdot )\right) . \end{aligned}$$

Notice that the left-hand side of (34) is connected if and only if A is not a projection, in which case it must be a subset of one of the components of the right-hand side. However, this is impossible because the left-hand side contains a maximal set of pairwise orthogonal rank-one projections. Therefore \(A\in {{\mathcal {P}}}(H)\), and in particular \(\mathrm{supp}\left( \Lambda (A,\cdot ) + \Lambda (A^\perp ,\cdot )\right) \) has two connected components. From here using (33) for both A and P we easily complete the proof. \(\square \)

We introduce a new relation on \({{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\). For \(A,B \in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\) we write \(A \prec B\) if and only if for every \(C\in A^\sim \setminus \mathcal {SC}(H)\) there exists a \(D\in B^\sim \setminus \mathcal {SC}(H)\) such that \(C^\sim \subseteq D^\sim \). Clearly, for every non-scalar effect B we have \(B \prec B\) and \(B^\perp \prec B\). In particular \(\prec \) is a reflexive relation, but it is not antisymmetric. It is also straightforward from the definition that \(\prec \) is a transitive relation, i.e. \(A \prec B\) and \(B \prec C\) imply \(A \prec C\).

We proceed with characterising non-trivial projections in terms of the relation of coexistence.

Lemma 4.2

Assume that \(A \in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\). Then the following two statements are equivalent:

  1. (i)

    \(A \in {{\mathcal {P}}}(H)\),

  2. (ii)

    \(\# \{ B \in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H) :B \prec A \} = 2\).


(i)\(\Longrightarrow \)(ii): Suppose that \(B \in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\), \(B\ne A\), \(B\ne A^\perp \) and \(B \prec A\). We need to show that this assumption leads to a contradiction. By Lemma 4.1 there exists a rank one effect tQ, with some \(Q\in {{\mathcal {P}}}_1(H)\) and \(t \in (0,1]\), such that \(tQ \sim B\) but \(tQ\not \sim A\). From \(B \prec A\) we know that there exists a non-scalar effect D such that

$$\begin{aligned} (tQ)^\sim \subseteq D^\sim \ \ \ \mathrm{and} \ \ \ D \sim A. \end{aligned}$$

By Lemma 2.12 (a) we have

$$\begin{aligned} D \in (tQ)''\cap {{\mathcal {E}}}(H) = Q''\cap {{\mathcal {E}}}(H) = \left\{ sQ + r Q^\perp \in {{\mathcal {E}}}(H) :s,r \in [0,1] \right\} , \end{aligned}$$

where the latter equation is easy to see (even in non-separable Hilbert spaces). Since we also have \(D\in A^c\), we obtain \(Q\in A^c\), hence the contradiction \(tQ\in A^c = A^\sim \).

(ii)\(\Longrightarrow \)(i): Here we use contraposition, so let us assume that \(A \in \left( {{\mathcal {E}}}(H)\setminus {{\mathcal {P}}}(H)\right) \setminus \mathcal {SC}(H)\). We shall construct a non-trivial projection P (which is obviously different from both A and \(A^\perp \)) such that \(P\prec A\). First, notice that there exists an \(0< \varepsilon < \tfrac{1}{2}\) such that \(H_A\left( \left( \varepsilon , 1-\varepsilon \right] \right) \notin \left\{ \{0\}, H \right\} \). Indeed, otherwise an elementary examination of the spectrum gives that \(\sigma (A) \subseteq \{\varepsilon _0,1-\varepsilon _0\}\) holds with some \(0< \varepsilon _0 < \tfrac{1}{2}\). As A is non-scalar, we actually get \(\sigma (A) = \{\varepsilon _0,1-\varepsilon _0\}\), which implies that \(H_A\left( \left( \varepsilon _0, 1-\varepsilon _0 \right] \right) \) is a non-trivial subspace.

Let us now consider the orthogonal decomposition \(H = H_1 \oplus H_2 \oplus H_3\) where

$$\begin{aligned} H_1 = H_A\left( \left[ 0, \varepsilon \right] \right) , \;\; H_2 = H_A\left( \left( \varepsilon , 1-\varepsilon \right] \right) \;\; \text {and} \;\; H_3 = H_A\left( \left( 1-\varepsilon , 1 \right] \right) . \end{aligned}$$

With respect to this orthogonal decomposition we have

$$\begin{aligned} A = \left[ \begin{matrix} A_1 &{} 0 &{} 0 \\ 0 &{} A_2 &{} 0 \\ 0 &{} 0 &{} A_3 \end{matrix} \right] \in {{\mathcal {E}}}(H_1 \oplus H_2 \oplus H_3). \end{aligned}$$

Since coexistence is invariant under taking the ortho-complements, we may assume without loss of generality that \(H_3\ne \{0\}\). Let us set

$$\begin{aligned} P = \left[ \begin{matrix} I &{} 0 &{} 0 \\ 0 &{} I &{} 0 \\ 0 &{} 0 &{} 0 \end{matrix} \right] \notin \mathcal {SC}(H_1 \oplus H_2 \oplus H_3). \end{aligned}$$

Our goal is to show that \(P \prec A\). Let C be an arbitrary non-scalar effect coexistent with P. Then, since C and P commute, the matrix form of C is

$$\begin{aligned} C = \left[ \begin{matrix} C_{11} &{} C_{12} &{} 0 \\ C_{12}^* &{} C_{22} &{} 0 \\ 0 &{} 0 &{} C_{33} \end{matrix} \right] \in {{\mathcal {E}}}(H_1 \oplus H_2 \oplus H_3). \end{aligned}$$

Consider the effect \(D := \varepsilon \cdot C\) and notice that

$$\begin{aligned} \varepsilon \cdot \left[ \begin{matrix} C_{11} &{} C_{12} &{} 0 \\ C_{12}^* &{} C_{22} &{} 0 \\ 0 &{} 0 &{} 0 \end{matrix} \right] \le I-A \;\; \text {and} \;\; \varepsilon \cdot \left[ \begin{matrix} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} C_{33} \end{matrix} \right] \le A. \end{aligned}$$

Clearly, by Lemmas 2.3 and 2.12 we have \(D\sim A\) and \(C^\sim \subseteq D^\sim \), which completes the proof. \(\square \)

Next, we characterise commutativity preservers on \({{\mathcal {P}}}(H)\). We note that the following theorem has been proved before implicitly in [23] for separable spaces, and was stated explicitly in [24, Theorem 2.8]. In order to prove the theorem for general spaces, one only has to use the ideas of [23], however, we decided to include the proof for the sake of completeness and clarity.

Theorem 4.3

Let H be a Hilbert space of dimension at least three and \(\phi :{{\mathcal {P}}}(H)\rightarrow {{\mathcal {P}}}(H)\) be a bijective mapping that preserves commutativity in both directions, i.e.

$$\begin{aligned} PQ = QP \;\; \iff \;\; \phi (P)\phi (Q) = \phi (Q)\phi (P) \qquad (P,Q \in {{\mathcal {P}}}(H)). \end{aligned}$$

Then there exists a unitary or antiunitary operator \(U:H\rightarrow H\) such that

$$\begin{aligned} \phi (P) \in \{ UPU^*, UP^\perp U^* \} \qquad (P \in {{\mathcal {P}}}(H)). \end{aligned}$$


For an arbitrary set \({\mathcal {M}} \subseteq {{\mathcal {P}}}(H)\) let us use the following notations: \({\mathcal {M}}^{\mathfrak {c}}:= {\mathcal {M}}^c \cap {{\mathcal {P}}}(H)\) and \({\mathcal {M}}^{{\mathfrak {c}}{\mathfrak {c}}} := ({\mathcal {M}}^{\mathfrak {c}})^{\mathfrak {c}}\). By the properties of \(\phi \) we immediately get \(\phi ({\mathcal {M}}^{\mathfrak {c}}) = \phi ({\mathcal {M}})^{\mathfrak {c}}\) and \(\phi ({\mathcal {M}}^{{\mathfrak {c}}{\mathfrak {c}}}) = \phi ({\mathcal {M}})^{{\mathfrak {c}}{\mathfrak {c}}}\) for all subset \({\mathcal {M}}\).

Next, let P and Q be two arbitrary commuting projections. Then (for instance by the Halmos’s two projections theorem) we have

$$\begin{aligned} P = \left[ \begin{matrix} I &{} 0 &{} 0 &{} 0 \\ 0 &{} I &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ \end{matrix}\right] \;\; \text {and} \;\; Q = \left[ \begin{matrix} I &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} I &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ \end{matrix}\right] \in {{\mathcal {B}}}(H_1\oplus H_2 \oplus H_3 \oplus H_4) \end{aligned}$$

where \(H_1 = \mathrm{Im\,}P \cap \mathrm{Im\,}Q\), \(H_2 = \mathrm{Im\,}P \cap \mathrm{Ker}Q\), \(H_3 = \mathrm{Ker}P \cap \mathrm{Im\,}Q\), \(H_4 = \mathrm{Ker}P \cap \mathrm{Ker}Q\) and \(H = H_1\oplus H_2 \oplus H_3 \oplus H_4\). Note that some of these subspaces might be trivial. We observe that

$$\begin{aligned} \{P,Q\}^{{\mathfrak {c}}{\mathfrak {c}}} = (\{P,Q\}^{\mathfrak {c}})^{\mathfrak {c}}&= \left\{ \left[ \begin{matrix} R_1 &{} 0 &{} 0 &{} 0 \\ 0 &{} R_2 &{} 0 &{} 0 \\ 0 &{} 0 &{} R_3 &{} 0 \\ 0 &{} 0 &{} 0 &{} R_4 \\ \end{matrix}\right] :R_j \in {{\mathcal {P}}}(H_j),\; j=1,2,3,4 \right\} ^{\mathfrak {c}}\\&= \left\{ \left[ \begin{matrix} \lambda _1 I &{} 0 &{} 0 &{} 0 \\ 0 &{} \lambda _2 I &{} 0 &{} 0 \\ 0 &{} 0 &{} \lambda _3 I &{} 0 \\ 0 &{} 0 &{} 0 &{} \lambda _4 I \\ \end{matrix}\right] :\lambda _j \in \{0,1\},\; j=1,2,3,4 \right\} . \end{aligned}$$

Hence we conclude that \(\#\{P,Q\}^{{\mathfrak {c}}{\mathfrak {c}}} = 2^{\#\{j:H_j \ne \{0\}\}}\). In particular, \(\#\{P,Q\}^{{\mathfrak {c}}{\mathfrak {c}}} = 2\) if and only if \(P,Q\in \{0,I\}\), and \(\#\{P,Q\}^{{\mathfrak {c}}{\mathfrak {c}}} = 4\) if and only if either \(P\notin \{0,I\}\) and \(Q\in \{I,0,P,P^\perp \}\), or \(Q\notin \{0,I\}\) and \(P\in \{I,0,Q,Q^\perp \}\).

Now, we easily conclude the following characterisation of rank-one and co-rank-one projections:

$$\begin{aligned} P \;\text {or}\; P^\perp \in {{\mathcal {P}}}_1(H) \;\;\iff \;\; \#\{P,Q\}^{{\mathfrak {c}}{\mathfrak {c}}} \in \{4,8\} \;\;\text {holds for all}\; Q\in P^{\mathfrak {c}}. \end{aligned}$$

This implies that

$$\begin{aligned} \phi (\{P :P \;\text {or}\; P^\perp \in {{\mathcal {P}}}_1(H)\}) = \{P :P \;\text {or}\; P^\perp \in {{\mathcal {P}}}_1(H)\}. \end{aligned}$$

Note that we also have \(\phi (P^\perp ) = \phi (P)^\perp \) for every \(P\in {{\mathcal {P}}}(H)\), as \(P^{\mathfrak {c}}= Q^{\mathfrak {c}}\) holds exactly when \(P=Q\) or \(P+Q = I\). Since changing the images of some pairs of ortho-complemented projections to their orto-complementations does not change the property (35), we may assume without loss of generality that \(\phi ({{\mathcal {P}}}_1(H)) = {{\mathcal {P}}}_1(H)\). It is easy to see that two rank-one projections commute if and only if either they coincide, or they are orthogonal to each other. Thus, as \(\dim H \ge 3\), Uhlhorn’s theorem [32] gives that there exist a unitary or antiunitary operator \(U:H\rightarrow H\) such that

$$\begin{aligned} \phi (P) = UPU^* \qquad (P \in {{\mathcal {P}}}_1(H)). \end{aligned}$$

Finally, note that for every projection \(Q\in {{\mathcal {P}}}(H)\) we have

$$\begin{aligned} Q^{\mathfrak {c}}\cap {{\mathcal {P}}}_1(H) = \{ P\in {{\mathcal {P}}}_1(H) :\mathrm{Im\,}P \subset \mathrm{Im\,}Q \cup \mathrm{Ker}Q \}, \end{aligned}$$

from which we easily complete the proof. \(\square \)

Before we prove Theorem 1.3 in the general case, we need one more technical lemma for non-separable Hilbert spaces. We will use the notation \({{\mathcal {E}}}_{fs}(H)\) for the set of all effects whose spectrum has finitely many elements.

Lemma 4.4

For all \(A\in {{\mathcal {E}}}_{fs}(H)\) we have

$$\begin{aligned} A^{cc} = A'' \cap {{\mathcal {E}}}(H) = \{ p(A)\in {{\mathcal {E}}}(H) :p \text { is a polynomial} \}. \end{aligned}$$


We only have to observe the following for all \(A\in {{\mathcal {E}}}(H)\) with \(\#\sigma (A) = n \in {\mathbb {N}}\), where \(E_1, \dots E_n\) are the spectral projections and \(H_j = \mathrm{Im\,}E_j\) \((j=1,2,\dots n)\):

$$\begin{aligned} A^{cc}&= \left( \bigcap _{j=1}^n E_j^c \right) ^c = \left\{ \bigoplus _{j=1}^n B_j:B_j \in {{\mathcal {E}}}\left( H_j\right) \text { for all } j \right\} ^c \\&= \left\{ \sum _{j=1}^n \mu _j E_j :\mu _j \in [0,1] \text { for all } j \right\} \\&= \left\{ \bigoplus _{j=1}^n T_j :T_j \in {{\mathcal {B}}}(H_j) \text { for all } j \right\} ' \cap {{\mathcal {E}}}(H) = \left( \bigcap _{j=1}^n E_j' \right) ' \cap {{\mathcal {E}}}(H) = A'' \cap {{\mathcal {E}}}(H). \end{aligned}$$

\(\square \)

Now, we are in the position to prove our second main theorem in the general case.

Proof of Theorem 1.3 for spaces of dimension at least three

The proof will be divided into the following steps:

  1. 1

    we show that \(\phi \) maps \({{\mathcal {E}}}_{fs}(H)\) onto itself,

  2. 2

    we prove that \(\phi \) has the form (2) on \({{\mathcal {E}}}_{fs}(H)\setminus \mathcal {SC}(H)\),

  3. 3

    we show that \(\phi \) has the form (2) on \({{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)\).

STEP 1: First, similarly as in the previous section, we easily get the existence of a bijective function \(g:[0,1] \rightarrow [0,1]\) such that

$$\begin{aligned} \phi (t I) = g(t)I \qquad (t\in [0,1]). \end{aligned}$$

Of course, the properties of \(\phi \) imply \(\phi (A)^\sim = \phi (A^\sim )\) for all \(A\in {{\mathcal {E}}}(H)\), and also

$$\begin{aligned} B^\sim \subseteq A^\sim \;\;\iff \;\; \phi (B)^\sim \subseteq \phi (A)^\sim \qquad (A,B\in {{\mathcal {E}}}(H)). \end{aligned}$$

From the latter it follows that

$$\begin{aligned} B \prec A \;\;\iff \;\; \phi (B) \prec \phi (A) \qquad (A,B\in {{\mathcal {E}}}(H)\setminus \mathcal {SC}(H)). \end{aligned}$$

Hence by Lemma 4.2 we obtain

$$\begin{aligned} \phi (P(H)\setminus \{0,I\}) = P(H)\setminus \{0,I\}, \end{aligned}$$

and therefore Lemma 2.1 (b) implies that the restriction \(\phi |_{P(H)\setminus \{0,I\}}\) preserves commutativity in both directions. Applying Theorem 4.3 then gives that up to unitary–antiunitary equivalence and element-wise ortho-complementation, we have

$$\begin{aligned} \phi (P) = P \qquad (P\in P(H)\setminus \{0,I\}). \end{aligned}$$

From now on we may assume without loss of generality that this is the case.

Next, by the spectral theorem [7, Theorem IX.2.2] we have

$$\begin{aligned} A^c = \bigcap _{\Delta \in \mathcal {B}_{[0,1]}} E_A(\Delta )^c = \bigcap _{\Delta \in \mathcal {B}_{[0,1]}} E_A(\Delta )^\sim \qquad (A\in {{\mathcal {E}}}(H)). \end{aligned}$$

Therefore we obtain

$$\begin{aligned} \phi (A^c) = \bigcap _{\Delta \in \mathcal {B}_{[0,1]}} \phi (E_A(\Delta ))^\sim = \bigcap _{\Delta \in \mathcal {B}_{[0,1]}} E_A(\Delta )^\sim = A^c \quad (A\in {{\mathcal {E}}}(H)), \end{aligned}$$

and thus also

$$\begin{aligned} \phi (A^{cc}) = \phi \left( \bigcap _{B\in A^c} B^c\right) = \bigcap _{B\in A^c} \phi \left( B^c\right) = \bigcap _{B\in A^c} B^c = A^{cc} \qquad (A\in {{\mathcal {E}}}(H)). \end{aligned}$$

In particular, we have

$$\begin{aligned} \phi (A)\in A^{cc} \qquad (A\in {{\mathcal {E}}}(H)). \end{aligned}$$

Hence for all \(A\in {{\mathcal {E}}}_{fs}(H)\) there exists a polynomial \(p_A\) such that \(p_A(\sigma (A)) \subset [0,1]\) and

$$\begin{aligned} \phi (A) = p_A(A) \qquad (A\in {{\mathcal {E}}}_{fs}(H)). \end{aligned}$$

As a similar statement holds for \(\phi ^{-1}\), we immediately get \(\phi ({{\mathcal {E}}}_{fs}(H)) = {{\mathcal {E}}}_{fs}(H)\). Also, notice that \(\#\sigma (\phi (A)) = \#\sigma (p_A(A)) \le \#\sigma (A)\) and \(\#\sigma (\phi ^{-1}(A)) \le \#\sigma (A)\) hold for all \(A\in {{\mathcal {E}}}_{fs}(H)\). Whence we obtain

$$\begin{aligned} \#\sigma (\phi (A))= \#\sigma (A) \qquad (A\in {{\mathcal {E}}}_{fs}(H)). \end{aligned}$$

In particular, the restriction \(p_A|_{\sigma (A)}\) is injective.

STEP 2: Now, let M be an arbitrary two-dimensional subspace of H and let \(P_M\in {{\mathcal {P}}}(H)\) be the orthogonal projection onto M. Consider two arbitrary effects \(A,B\in (P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H)\) which therefore have the following matrix representations:

$$\begin{aligned} A = \left[ \begin{matrix} A_M &{} 0 \\ 0 &{} A_{M^\perp } \end{matrix}\right] \quad \text {and}\quad B = \left[ \begin{matrix} B_M &{} 0 \\ 0 &{} B_{M^\perp } \end{matrix}\right] \in {{\mathcal {E}}}_{fs}(M \oplus M^\perp ). \end{aligned}$$


$$\begin{aligned} \phi (A) = p_A(A) = \left[ \begin{matrix} p_A(A_M) &{} 0 \\ 0 &{} p_A(A_{M^\perp }) \end{matrix}\right] \quad \text {and}\quad \phi (B) = p_B(B) = \left[ \begin{matrix} p_B(B_M) &{} 0 \\ 0 &{} p_B(B_{M^\perp }) \end{matrix}\right] . \end{aligned}$$

Note that by (39), the polynomial \(p_A\) acts injectively on \(\sigma (A)\), therefore

$$\begin{aligned} A_M\in \mathcal {SC}(M) \;\;\iff \;\; p_A(A_M)\in \mathcal {SC}(M), \end{aligned}$$

and of course, similarly for B. We observe that by Lemma 2.2 the following two equations hold:

$$\begin{aligned} A^\sim \bigcap \left[ \begin{matrix} I &{} 0 \\ 0 &{} 0 \end{matrix}\right] ^\sim \bigcap \left( \bigcap _{P\in {{\mathcal {P}}}_1(M^\perp )} \left[ \begin{matrix} 0 &{} 0 \\ 0 &{} P \end{matrix}\right] ^\sim \right) = \left\{ \left[ \begin{matrix} D &{} 0 \\ 0 &{} \lambda I \end{matrix}\right] :D\sim A_M, \lambda \in [0,1] \right\} \end{aligned}$$


$$\begin{aligned} \phi (A)^\sim \bigcap \left[ \begin{matrix} I &{} 0 \\ 0 &{} 0 \end{matrix}\right] ^\sim \bigcap \left( \bigcap _{P\in {{\mathcal {P}}}_1(M^\perp )} \left[ \begin{matrix} 0 &{} 0 \\ 0 &{} P \end{matrix}\right] ^\sim \right) = \left\{ \left[ \begin{matrix} D &{} 0 \\ 0 &{} \lambda I \end{matrix}\right] :D\sim p_A(A_M), \lambda \in [0,1] \right\} . \end{aligned}$$

It is important to observe that by (38) the set in (41) is the \(\phi \)-image of (40). Thus we obtain the following equivalence if \(A_M \notin \mathcal {SC}(M)\):

$$\begin{aligned} B_M \in \{A_M, A_M^\perp \} \;&\iff \; A_M^\sim = B_M^\sim \nonumber \\&\iff \; \left\{ \left[ \begin{matrix} D &{} 0 \\ 0 &{} \lambda I \end{matrix}\right] :D\sim A_M, \lambda \in [0,1]\right\} \nonumber \\&= \left\{ \left[ \begin{matrix} E &{} 0 \\ 0 &{} \mu I \end{matrix}\right] :E\sim B_M, \mu \in [0,1]\right\} \nonumber \\&\iff \; \left\{ \left[ \begin{matrix} D &{} 0 \\ 0 &{} \lambda I \end{matrix}\right] :D\sim p_A(A_M), \lambda \in [0,1]\right\} \nonumber \\&= \left\{ \left[ \begin{matrix} E &{} 0 \\ 0 &{} \mu I \end{matrix}\right] :E\sim p_B(B_M), \mu \in [0,1]\right\} \nonumber \\&\iff \; \left( p_A(A_M)\right) ^\sim = \left( p_B(B_M)\right) ^\sim \nonumber \\&\iff \; p_B(B_M) \in \left\{ p_A(A_M), I-p_A(A_M) \right\} . \end{aligned}$$

Now, we are in the position to use the previously proved two-dimensional version. Let

$$\begin{aligned} {\mathfrak {E}}(M) := \left\{ \left\{ D,D^\perp \right\} :D\in {{\mathcal {E}}}(M)\setminus \mathcal {SC}(M) \right\} \cup \{\mathcal {SC}(M)\}, \end{aligned}$$

and let us say that two elements of \({\mathfrak {E}}(M)\) are coexistent, in notation \(\approx \), if either one of them is \(\mathcal {SC}(M)\), or the two elements are \(\{D,D^\perp \}\) and \(\{E,E^\perp \}\) with \(D\sim E\). Clearly, the bijective restriction

$$\begin{aligned} \phi |_{(P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H)}:(P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H) \rightarrow (P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H) \end{aligned}$$

induces a well-defined bijection on \({\mathfrak {E}}(M)\) by

$$\begin{aligned} \mathcal {SC}(M)\mapsto \mathcal {SC}(M), \; \{A_M,A_M^\perp \}\mapsto \{p_A(A_M), p_A(A_M)^\perp \} \qquad (A_M \notin \mathcal {SC}(M)). \end{aligned}$$

Notice that this map also preserves the relation \(\approx \) in both directions. Indeed, for all \(A,B\in (P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H)\), \(A_M,B_M \notin \mathcal {SC}(H)\) we have

$$\begin{aligned} \{A_M, A_M^\perp \} \approx \{B_M, B_M^\perp \}&\;\iff \; {\hat{A}} := \left[ \begin{matrix} A_M &{} 0 \\ 0 &{} 0 \end{matrix}\right] \sim {\hat{B}} := \left[ \begin{matrix} B_M &{} 0 \\ 0 &{} 0 \end{matrix}\right] \\&\;\iff \; \left[ \begin{matrix} p_{{\hat{A}}}(A_M) &{} 0 \\ 0 &{} p_{{\hat{A}}}(0) I \end{matrix}\right] \sim \left[ \begin{matrix} p_{{\hat{B}}}(B_M) &{} 0 \\ 0 &{} p_{{\hat{B}}}(0) I \end{matrix}\right] \\&\;\iff \; \{p_{{\hat{A}}}(A_M), p_{{\hat{A}}}(A_M)^\perp \} \approx \{p_{{\hat{B}}}(B_M), p_{{\hat{B}}}(B_M)^\perp \} \\&\;\iff \; \{p_{A}(A_M), p_{A}(A_M)^\perp \} \approx \{p_{B}(B_M), p_{B}(B_M)^\perp \}. \end{aligned}$$

Therefore, using the two-dimensional version of Theorem 1.3, we obtain a unitary or antiunitary operator \(U_M:M\rightarrow M\) such that

$$\begin{aligned} p_A(A_M) \in \{U_M (A_M) U_M^*, U_M (A_M)^\perp U_M^*\} \qquad (A\in (P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H), \; A_M\notin \mathcal {SC}(M)) \end{aligned}$$


$$\begin{aligned} p_A(A_M) \in \mathcal {SC}(M) \qquad (A\in (P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H), \; A_M\in \mathcal {SC}(M)). \end{aligned}$$

Observe that this implies the following: for any pair of orthogonal unit vectors \(x,y\in M\) we must have either \(U_M({\mathbb {C}}\cdot x) = {\mathbb {C}}\cdot x\) and \(U_M({\mathbb {C}}\cdot y) = {\mathbb {C}}\cdot y\), or \(U_M({\mathbb {C}}\cdot x) = {\mathbb {C}}\cdot y\) and \(U_M({\mathbb {C}}\cdot y) = {\mathbb {C}}\cdot x\). As \(U_M\) is continuous, we have either the first case for all orthogonal pairs \({\mathbb {C}}\cdot x,{\mathbb {C}}\cdot y\), or the second for every such pair. But a similar statement holds for all two-dimensional subspaces, therefore it is easy to show that the second possibility cannot occur. Consequently, we have \(U_M({\mathbb {C}}\cdot x) = {\mathbb {C}}\cdot x\) for all unit vectors \(x\in M\), from which it follows that \(U_M\) is a scalar multiple of the identity operator. Thus we obtain the following for every two-dimensional subspace M:

$$\begin{aligned} p_A(A_M) \in \{A_M, A_M^\perp \} \qquad (A\in (P_M)^\sim \cap {{\mathcal {E}}}_{fs}(H), \; A_M\notin \mathcal {SC}(M)). \end{aligned}$$

From here it is rather straightforward to obtain

$$\begin{aligned} \phi (A) = p_A(A) \in \{A, A^\perp \} \qquad (A\in {{\mathcal {E}}}_{fs}(H)\setminus \mathcal {SC}(M)). \end{aligned}$$

STEP 3: Observe that (43) holds for every \(A\in {{\mathcal {F}}}(H)\), therefore an application of Theorem 1.1 and Corollary 2.5 completes the proof in the separable case. As for the general case, let us consider an arbitrary effect \(A\in {{\mathcal {E}}}(H)\setminus {{\mathcal {E}}}_{fs}(H)\) and an orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}} H_i\) such that each \(H_i\) is a separable invariant subspace of A. By (38) and Lemma 2.1 (b), each \(H_i\) is an invariant subspace also for \(\phi (A)\), in particular, we have

$$\begin{aligned} A = \oplus _{i\in \mathcal {I}} A_i, \;\;\text {and}\;\; \phi (A) = \oplus _{i\in \mathcal {I}} \mathcal {A}_i \in {{\mathcal {E}}}(\oplus _{i\in \mathcal {I}} H_i). \end{aligned}$$

Without loss of generality we may assume from now on that there exists an \(i_0\in \mathcal {I}\) so that \(A_{i_0}\) is not a scalar effect.

Now, let \(i\in \mathcal {I}\), \(F\in {{\mathcal {F}}}(H)\) and \(\mathrm{Im\,}F \subseteq H_{i}\) be arbitrary. Then by (43) we have

$$\begin{aligned} A_{i} \sim P_{i}F|_{H_{i}} \;\;\iff \;\; A\sim F \;\;\iff \;\; \phi (A)\sim F \;\;\iff \;\; \mathcal {A}_{i} \sim P_{i}F|_{H_{i}}. \end{aligned}$$

In particular, \(A_{i}^\sim \cap {{\mathcal {F}}}(H_{i}) = \mathcal {A}_{i}^\sim \cap {{\mathcal {F}}}(H_{i})\), therefore by Theorem 1.1 we get that for all i we have either \(A_i, \mathcal {A}_i\in \mathcal {SC}(H)\), or \(A_i = \mathcal {A}_i\), or \(\mathcal {A}_i = A_i^\perp \). By considering \(A^\perp \) instead of A if necessary, we may assume that we have \(A_{i_0} = \mathcal {A}_{i_0}\). Finally, for any \(i_1\in \mathcal {I}\setminus \{i_0\}\) let us consider the orthogonal decomposition \(H = \oplus _{i\in \mathcal {I}\setminus \{i_0,i_1\}} H_i \oplus (H_{i_0} \oplus H_{i_1}) \). Similarly as above, we then get \(A_{i_0}\oplus A_{i_1} = \mathcal {A}_{i_0} \oplus \mathcal {A}_{i_1}\), and the proof is complete. \(\square \)

5 A Remark on the Qubit Case

Here we visualise the set \(A^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\) for a general rank-one qubit effect A. First, let us introduce Bloch’s representation. Consider the following vector space isomorphism between the space of all \(2 \times 2\) Hermitian matrices \({{\mathcal {B}}}_{sa}({\mathbb {C}}^2)\) and \({\mathbb {R}}^4\), see also [4]:

$$\begin{aligned} \rho :{{\mathcal {B}}}_{sa}({\mathbb {C}}^2) \rightarrow {\mathbb {R}}^4, \quad \rho (A) = \rho (x_0\sigma _0+x_1\sigma _1+x_2\sigma _2+x_3\sigma _3) = (x_0,x_1,x_2,x_3), \end{aligned}$$


$$\begin{aligned} \sigma _0 = \left[ \begin{matrix} 1 &{} 0 \\ 0 &{} 1 \end{matrix}\right] , \; \sigma _1 = \left[ \begin{matrix} 0 &{} 1 \\ 1 &{} 0 \end{matrix}\right] , \; \sigma _2 = \left[ \begin{matrix} 0 &{} -i \\ i &{} 0 \end{matrix}\right] , \; \sigma _3 = \left[ \begin{matrix} 1 &{} 0 \\ 0 &{} -1 \end{matrix}\right] \end{aligned}$$

are the Pauli matrices. Clearly, we have \(\rho (0) = (0,0,0,0)\), \(\rho (I) = (1,0,0,0)\). The Bloch representation is usually defined as the restriction \(\rho |_{{{\mathcal {P}}}_1({\mathbb {C}}^2)}\) which maps \({{\mathcal {P}}}_1({\mathbb {C}}^2)\) onto a sphere of the three-dimensional affine subspace \(\{ (1/2,x_1,x_2,x_3) :x_j\in {\mathbb {R}}, j=1,2,3 \}\) with centre at (1/2, 0, 0, 0) and radius 1/2. Indeed, as the general form of a rank-one projection in \({\mathbb {C}}^2\) is

$$\begin{aligned} P_{(\cos \vartheta , e^{i\mu } \sin \vartheta )} = \left[ \begin{matrix} \cos \vartheta \\ e^{i\mu } \sin \vartheta \end{matrix}\right] \left[ \begin{matrix} \cos \vartheta \\ e^{i\mu } \sin \vartheta \end{matrix}\right] ^* = \left[ \begin{matrix} \cos ^2\vartheta &{} e^{-i\mu } \cos \vartheta \sin \vartheta \\ e^{i\mu } \cos \vartheta \sin \vartheta &{} \sin ^2\vartheta \\ \end{matrix}\right] \end{aligned}$$

where \(0\le \vartheta \le \tfrac{\pi }{2}\) and \(0\le \mu < 2\pi \), a not too hard calculation gives that

$$\begin{aligned} \rho (P_{(\cos \vartheta , e^{i\mu } \sin \vartheta )}) = \tfrac{1}{2} \cdot ( 1, \cos \mu \sin 2\vartheta , \sin \mu \sin 2\vartheta , \cos 2\vartheta ). \end{aligned}$$

Recall the remarkable angle doubling property of the Bloch representation, namely, we have \(\Vert P-Q\Vert = \sin \theta \) if and only if the angle between the vectors \(\rho (P) - \tfrac{1}{2}e_0\) and \(\rho (Q) - \tfrac{1}{2}e_0\) is exactly \(2\theta \).

Next, we call a positive (semi-definite) element of \({{\mathcal {B}}}_{sa}({\mathbb {C}}^2)\) a density matrix if its trace is 1, or in other words, if it is a convex combination of some rank-one projections. Therefore \(\rho \) maps the set of all \(2\times 2\) density matrices onto the closed ball of the three-dimensional affine subspace \(\{ (1/2,x_1,x_2,x_3) :x_j\in {\mathbb {R}}, j=1,2,3 \}\) with centre at (1/2, 0, 0, 0) and radius 1/2. Hence, we see that the cone of all positive (semi-definite) \(2\times 2\) matrices is mapped onto the infinite cone spanned by (0, 0, 0, 0) and the aforementioned ball. Thus \(\rho \) maps \({{\mathcal {E}}}({\mathbb {C}}^2)\) onto the intersection of this cone and its reflection through the point \(\rho (\tfrac{1}{2}I) = (\tfrac{1}{2},0,0,0)\).

We can re-write (44) as follows:

$$\begin{aligned} \rho (P_{(\cos \vartheta , e^{i\mu } \sin \vartheta )}) = \tfrac{1}{2} \cdot ( e_0 + \sin 2\vartheta \cdot e_\mu + \cos 2\vartheta \cdot e_3), \end{aligned}$$


$$\begin{aligned} e_0 := (1,0,0,0), \; e_\mu := (0, \cos \mu , \sin \mu , 0), \; e_3 := (0, 0, 0, 1) \end{aligned}$$

is an orthonormal system in \({\mathbb {R}}^4\). Let \(S_\mu \) be the three-dimensional subspace spanned by \(e_0, e_\mu , e_3\). Then the set \(\rho ({{\mathcal {E}}}({\mathbb {C}}^2))\cap S_\mu \) can be visualised as a double cone of \({\mathbb {R}}^3\), by regarding \(e_0, e_\mu , e_3\) as the standard basis of \({\mathbb {R}}^3\), see Figure 2. Note that \(\rho ({{\mathcal {P}}}_1({\mathbb {C}}^2))\cap S_\mu \) is the circle where the boundaries of the two cones meet.

Fig. 2
figure 2

Illustration of \(\rho ({{\mathcal {E}}}({\mathbb {C}}^2))\cap S_\mu \). The circle is \(\rho ({{\mathcal {P}}}_1({\mathbb {C}}^2))\cap S_\mu \)

We continue with visualising the set \((tP_{(1, 0)})^\sim \) for an arbitrary \(0<t<1\). Note that then visualising \((tP)^\sim \) for a general rank-one projection P is very similar, we simply have to apply a unitary similarity (which by well-known properties of the Bloch representation, acts as a rotation on the sphere \(\rho ({{\mathcal {P}}}_1({\mathbb {C}}^2))\)). Equation (9) gives the following:

$$\begin{aligned}&\Lambda \left( tP_{(1, 0)}, P_{(\cos \vartheta , e^{i\mu } \sin \vartheta )} \right) + \Lambda \left( I-tP_{(1,0)}, P_{(\cos \vartheta , e^{i\mu } \sin \vartheta )} \right) \nonumber \\&\quad = \frac{1}{\tfrac{1}{t} \cos ^2\vartheta + \left( \tfrac{1}{0}\right) \sin ^2\vartheta } + \frac{1}{\tfrac{1}{1-t} \cos ^2\vartheta + \sin ^2\vartheta } = \left\{ \begin{array}{ll} \frac{1}{\tfrac{1}{1-t} \cos ^2\vartheta + \sin ^2\vartheta } &{} \text {if } \vartheta > 0 \\ 1 &{} \text {if } \vartheta = 0 \\ \end{array} \right. . \end{aligned}$$

Now, let us consider the vector

$$\begin{aligned} u = (2-t) \cdot e_0 + t \cdot e_3, \end{aligned}$$

which is orthogonal to

$$\begin{aligned} \rho \left( (1-t)P_{(1,0)}-P_{(1,0)}^\perp \right) = -\tfrac{1}{2}\left[ t\cdot e_0 + (t-2)\cdot e_3\right] . \end{aligned}$$

From here a bit tedious computation gives

$$\begin{aligned} \left\langle u, \; \frac{1}{ \tfrac{1}{1-t} \cos ^2\vartheta + \sin ^2\vartheta } \cdot \rho \left( P_{(\cos \vartheta , e^{i\mu } \sin \vartheta )} \right) - \rho \left( P_{(1,0)}^\perp \right) \right\rangle = 0 \quad (0\le \vartheta \le \tfrac{\pi }{2}). \end{aligned}$$

Therefore by Corollary 2.7 and (46) we conclude that \(\rho \left( (t P_{(1,0)})^\sim \cap {{\mathcal {F}}}_1({\mathbb {C}}^2)\right) \) is the union of the line segment \(\{\rho \left( s P_{(1,0)}\right) :0<s\le 1 \} = \{ \tfrac{s}{2} e_0 + \tfrac{s}{2} e_3 :0<s\le 1 \}\) and of the area on the boundary of \(\rho ({{\mathcal {E}}}({\mathbb {C}}^2))\) which is either on, or below the affine hyperplane whose normal vector is u and which contains \(\rho (P_{(1,0)}^\perp )\), see Figure 3. We note that using the notation of (22), the ellipse on the boundary is exactly the set

$$\begin{aligned} \left( \rho (\ell _{tP_{(1,0)}})\cup \big \{(1-t)\cdot \rho (P_{(1,0)}) ,\rho (P_{(1,0)}^\perp )\big \}\right) \cap S_\mu . \end{aligned}$$

Therefore \(\rho (\ell _{tP_{(1,0)}})\) is a punctured ellipsoid.

Fig. 3
figure 3

Illustration of \(\rho \left( (tP_{(1, 0)})^\sim \right) \cap \rho \left( {{\mathcal {F}}}_1({\mathbb {C}}^2)\right) \cap S_\mu \) (thick ellipse, thick line segment and the shaded area). The dotted circle is \(\rho ({{\mathcal {P}}}_1({\mathbb {C}}^2))\cap S_\mu \)

If one illustrates the set \(\rho \left( (A)^\sim \right) \cap \rho \left( {{\mathcal {F}}}_1({\mathbb {C}}^2)\right) \cap S_\mu \) with \(A, A^\perp \notin \mathcal {SC}({\mathbb {C}}^2) \cup {{\mathcal {F}}}_1({\mathbb {C}}^2)\) in the way as above, then one gets a set on the boundary of the cone which is bounded by a continuous closed curve containing the \(\rho \)-images of the spectral projections.

6 Final Remarks and Open Problems

First, we prove the analogue of Lemma 3.1 for finite dimensional spaces of dimension at least three.

Lemma 6.1

Let H be a Hilbert space with \(2\le \dim H < \infty \) and \(A\in {{\mathcal {E}}}(H)\). Then the following are equivalent:

  1. (i)

    \(0,1 \in \sigma (A)\),

  2. (ii)

    there exists no effect \(B\in {{\mathcal {E}}}(H)\) such that \(B^\sim \subsetneq A^\sim \).


If \(\dim H = 2\), then (i)\(\iff \)(ii) was proved in Lemma 3.1, so from now on we will assume \(2< \dim H < \infty \). Also, as the case when \(A\in \mathcal {SC}(H)\) is trivial, we assume otherwise throughout the proof.

(i)\(\Longrightarrow \)(ii): Suppose that \(0,1 \in \sigma (A)\) and consider an arbitrary effect B with \(B^\sim \subseteq A^\sim \). By Lemma 2.12, A and B commute. If \(0 = \lambda _1 \le \lambda _2 \le \dots \le \lambda _{n-1} \le \lambda _n = 1\) are the eigenvalues of A, then the matrices of A and B written in an orthonormal basis of joint eigenvectors are the following:

$$\begin{aligned} A = \left[ \begin{matrix} 0 &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} \lambda _2 &{} \dots &{} 0 &{} 0 \\ \vdots &{} &{} \ddots &{} &{} \vdots \\ 0 &{} 0 &{} \dots &{} \lambda _{n-1} &{} 0 \\ 0 &{} 0 &{} \dots &{} 0 &{} 1 \\ \end{matrix}\right] \quad \text {and} \quad B = \left[ \begin{matrix} \mu _1 &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{} \mu _2 &{} \dots &{} 0 &{} 0 \\ \vdots &{} &{} \ddots &{} &{} \vdots \\ 0 &{} 0 &{} \dots &{} \mu _{n-1} &{} 0 \\ 0 &{} 0 &{} \dots &{} 0 &{} \mu _n \\ \end{matrix}\right] \end{aligned}$$

with some \(\mu _1,\dots \mu _n \in [0,1]\). Notice that by Corollary 2.8, for all \(1\le i < j \le n\) we have

$$\begin{aligned} \left[ \begin{matrix} \mu _i &{} 0 \\ 0 &{} \mu _j \end{matrix}\right] ^\sim \subseteq \left[ \begin{matrix} \lambda _i &{} 0 \\ 0 &{} \lambda _j \end{matrix}\right] ^\sim . \end{aligned}$$

In particular, choosing \(i = 1, j = n\) implies either \(\mu _1 = 0\) and \(\mu _n = 1\), or \(\mu _1 = 1\) and \(\mu _n = 0\). Assume the first case. If we set \(i=1\), then Lemma 3.2 and (47) imply \(\mu _j \ge \lambda _j\) for all \(j = 2,\dots , n-1\). But on the other hand, setting \(j=n\) implies \(\mu _i \le \lambda _i\) for all \(i = 2,\dots , n-1\). Therefore we conclude \(B = A\). Similarly, assuming the second case implies \(B=A^\perp \).

(ii)\(\Longrightarrow \)(i): Assume (i) does not hold, then there exists a positive number \(\varepsilon \) such that \(\sigma (A) \subseteq [0,1-\varepsilon ]\) or \(\sigma (A^\perp ) \subseteq [0,1-\varepsilon ]\). Suppose the first possibility holds, then \(\tfrac{1}{1-\varepsilon }A \notin \{A,A^\perp \}\) and \(\left( \tfrac{1}{1-\varepsilon }A\right) ^\sim \subseteq A^\sim \). The second case is very similar. \(\square \)

We only proved the above lemma and Corollary 2.11 in the finite dimensional case. The following two questions would be interesting to examine:

Question 6.2

Does the statement of Corollary 2.11 remain true for general infinite dimensional Hilbert spaces?

Question 6.3

Does the statement of Lemma 6.1 hold if \(\dim H \ge \aleph _0\)?

Finally, our first main theorem characterises completely when \(A^\sim = B^\sim \) happens for two effects A and B. However, we gave only some partial results about when \(A^\sim \subseteq B^\sim \) occurs, e.g. Lemma 2.12.

Question 6.4

How can we characterise the relation \(A^\sim \subseteq B^\sim \) for effects AB?

We believe that a complete answer to this latter question would represent a substantial step towards the better understanding of coexistence.