Uncertainty from the Aharonov–Vaidman identity

In this article, I show how the Aharonov–Vaidman identity $A\left| \psi \right\rangle = \left\langle A \right\rangle \left| \psi \right\rangle + \Delta A \left| \psi ^{\perp }_A \right\rangle$ can be used to prove relations between the standard deviations of observables in quantum mechanics. In particular, I review how it leads to a more direct and less abstract proof of the Robertson uncertainty relation $\Delta A \Delta B \ge \frac{1}{2} \left| \left\langle [A,B] \right\rangle \right|$ than the textbook proof. I discuss the relationship between these two proofs and show how the Cauchy–Schwarz inequality can be derived from the Aharonov–Vaidman identity. I give Aharonov–Vaidman based proofs of the Maccone–Pati uncertainty relations and show how the Aharonov–Vaidman identity can be used to handle propagation of uncertainty in quantum mechanics. Finally, I show how the Aharonov–Vaidman identity can be extended to mixed states and discuss how to generalize the results to the mixed case.


Introduction
Let $A$ be a Hermitian operator on a Hilbert space $\mathcal{H}$. Then, for any (not necessarily normalized) vector $|\psi\rangle \in \mathcal{H}$,
$$A|\psi\rangle = \langle A\rangle|\psi\rangle + \Delta A\,|\psi^{\perp}_A\rangle, \tag{1}$$
where $\langle A\rangle = \langle\psi|A|\psi\rangle / \langle\psi|\psi\rangle$ is the expectation value of $A$, $\Delta A = \sqrt{\langle A^2\rangle - \langle A\rangle^2}$ is its standard deviation, and $|\psi^{\perp}_A\rangle$ is a vector that is orthogonal to $|\psi\rangle$, has equal norm $\langle\psi^{\perp}_A|\psi^{\perp}_A\rangle = \langle\psi|\psi\rangle$, and depends on the operator $A$.
Equation (1) is the Aharonov-Vaidman identity, which first appeared in [1]. Yakir Aharonov has stated that he "[does not] understand why it doesn't appear in every quantum book" [2]. The main purpose of this article is to explain why it should appear in undergraduate quantum mechanics textbooks.¹
¹ Other demonstrations of the usefulness of the Aharonov-Vaidman identity include its use in the proof that, for any state $|\psi\rangle$ and any observable $A$, $|\psi\rangle^{\otimes n}$ is an approximate eigenstate of the observable $\bar{A} = \frac{1}{n}\sum_{j=1}^{n} A_j$ for large $n$, where $A_j$ refers to $A$ acting on the $j$th subsystem [1], and its use in deriving the minimum time required for evolution to an orthogonal quantum state [3].
The uncertainty relation that is proved most often in quantum mechanics classes and textbooks is the Robertson relation [4]:
$$\Delta A\,\Delta B \geq \frac{1}{2}\left|\langle[A,B]\rangle\right|, \tag{2}$$
where $[A,B] = AB - BA$ is the commutator. As pointed out by Schrödinger [5], the Robertson relation can be extended to
$$(\Delta A)^2(\Delta B)^2 \geq \left|\frac{1}{2}\langle\{A,B\}\rangle - \langle A\rangle\langle B\rangle\right|^2 + \left|\frac{1}{2}\langle[A,B]\rangle\right|^2, \tag{3}$$
where $\{A,B\} = AB + BA$ is the anti-commutator.
Although not often emphasized in quantum mechanics classes, the Schrödinger relation is not harder to prove than the Robertson relation. In fact, the standard textbook proof of the Robertson relation effectively proves the Schrödinger relation and then throws away the anti-commutator term.
The proof almost universally adopted in textbooks is based on the Cauchy-Schwarz inequality. While this proof is elementary for those familiar with the mathematics of Hilbert spaces, it can be daunting for undergraduate physics students, who are likely encountering Hilbert spaces for the first time along with quantum mechanics.
In this article, I will review more direct proofs of eq. (2) and eq. (3) from the Aharonov-Vaidman identity that only make use of basic properties of complex numbers and inner products. These proofs previously appeared in [6] and the proof of the Robertson relation is also problem 3.10 in Aharonov and Rohrlich's book "Quantum Paradoxes" [7]. The proof of the Aharonov-Vaidman identity itself uses ideas similar to one of the standard proofs of the Cauchy-Schwarz inequality, but is perhaps more memorable to undergraduate physics students because it uses concepts that have a physical meaning, i.e. expectation values and standard deviations. The proof of the Robertson and Schrödinger relations so obtained is not independent of the standard Cauchy-Schwarz based proof. I shall discuss their relationship and show that the Cauchy-Schwarz inequality can itself be derived from eq. (1). The main virtue of using the Aharonov-Vaidman based proof of the uncertainty relation is that it is more direct and involves fewer abstractions.
To be clear, I am not against using or teaching the Cauchy-Schwarz inequality. It has been called "one of the most widely used and important inequalities in all of mathematics" [8]. In fact, the Aharonov-Vaidman based proof still uses one instance of the Cauchy-Schwarz inequality, namely that if $|\psi\rangle$ and $|\phi\rangle$ are unit vectors then $|\langle\phi|\psi\rangle| \leq 1$. But this is easily motivated by the idea that $\langle\phi|\psi\rangle$ is a generalization of the cosine of an angle, and it is used in a more direct way than in the standard proof. Students of quantum mechanics also need to know the Cauchy-Schwarz inequality to prove that the Born rule always yields well-defined probabilities. Physics students should learn the Cauchy-Schwarz inequality. I just think it should be used in a less abstract way where possible.
Besides the Robertson and Schrödinger relations, many other uncertainty relations are known. Indeed, since uncertainty relations have found applications in quantum information science [9,10,11,12,13,14,15] and quantum foundations [16,17], proving new ones has become something of a sport. The two most common classes of uncertainty relations are those based on entropy [18] and those based on standard deviations [4,5,19]. Many of the standard deviation based relations can be derived from the Aharonov-Vaidman identity. I include a proof of the Maccone-Pati uncertainty relations [20] to illustrate this. While these are not the most recent or tightest known uncertainty relations, I include them because they have a simple and elegant Aharonov-Vaidman based proof. For more recent work on standard deviation uncertainty relations, see [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41].
Another place where relationships between standard deviations are important is in the propagation of uncertainty. In classical statistics, the standard deviation of a function of several random variables is a function of the standard deviations of those variables (and their correlations if the variables are not independent). Formulas for the propagation of uncertainty tell us how to compute this function, and are commonly used to estimate experimental errors. In quantum mechanics, similar formulas can be derived relating the standard deviations of observables. They differ from their classical counterparts because quantum observables do not commute, but provided this is taken care of they can be derived by the same methods as in the classical case. However, they can alternatively be derived from the Aharonov-Vaidman identity, as I shall explain.
Although the Aharonov-Vaidman identity is usually discussed for pure quantum states, it can be extended to mixed states, either by use of purification or an equivalent concept called an amplitude operator. Relations between standard deviations can be extended to mixed states, but obtaining tight bounds is sometimes more difficult than in the pure case due to the need to optimize over all purifications or amplitude operators that can represent a given mixed state.
The remainder of this article is structured as follows. Section 2 gives the proof of the Aharonov-Vaidman identity and a corollary that is useful for understanding the equality conditions in uncertainty relations. Section 3 presents the proof of the Robertson and Schrödinger relations based on the Aharonov-Vaidman identity. Section 4 explains the relationship with the standard textbook proof of the Robertson relation and explains how the Cauchy-Schwarz inequality can be derived from the Aharonov-Vaidman identity. Section 5 comments on the effective teaching of the Robertson uncertainty relation via the Aharonov-Vaidman identity. Section 6 presents Aharonov-Vaidman based proofs of the Maccone-Pati uncertainty relations. Section 7 describes how to use the Aharonov-Vaidman identity to derive formulas for the propagation of quantum uncertainty. Section 8 explains how to generalize the Aharonov-Vaidman identity to mixed states using amplitude operators. (The relationship between amplitude operators and purifications is discussed in appendix A.) Finally, section 9 presents the summary and conclusions.
I intend this article to be pedagogical and self-contained, so as to be accessible to undergraduate students and anyone teaching introductory quantum mechanics.

Proof of the Aharonov-Vaidman Identity
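Sometimes, it is useful to generalize the Aharonov-Vaidman identity to non-Hermitian operators, so we prove the more general version here.
Proposition 2.1 (The Aharonov-Vaidman Identity). Let $A$ be a linear operator on a Hilbert space $\mathcal{H}$ and let $|\psi\rangle$ be a (not necessarily normalized) vector in $\mathcal{H}$. Then,
$$A|\psi\rangle = \langle A\rangle|\psi\rangle + \Delta A\,|\psi^{\perp}_A\rangle, \tag{4}$$
where
$$\langle A\rangle = \frac{\langle\psi|A|\psi\rangle}{\langle\psi|\psi\rangle}, \qquad \Delta A = \sqrt{\langle A^{\dagger}A\rangle - |\langle A\rangle|^2},$$
and $|\psi^{\perp}_A\rangle$ is a vector orthogonal to $|\psi\rangle$ that depends on both $|\psi\rangle$ and $A$ and satisfies
$$\langle\psi^{\perp}_A|\psi^{\perp}_A\rangle = \langle\psi|\psi\rangle.$$
Note that, if $A$ is Hermitian, then this reduces to eq. (1), where $\langle A\rangle$ and $\Delta A$ are the expectation value and standard deviation. In general, $\langle A\rangle$ is a complex number, but $\Delta A$ is always real and non-negative.
For most of what we need to do, it is sufficient to consider the case where $|\psi\rangle$ is a unit vector, in which case $|\psi^{\perp}_A\rangle$ is also a unit vector. The exception is the proof of the Cauchy-Schwarz inequality (proposition 4.1 in section 4), which uses the identity with an unnormalized vector.
Proof. Any vector can be decomposed into a component along $|\psi\rangle$ and a component orthogonal to it, so we can write
$$A|\psi\rangle = \alpha|\psi\rangle + \beta|\psi^{\perp}\rangle, \tag{5}$$
where $|\psi^{\perp}\rangle$ is orthogonal to $|\psi\rangle$ and is normalized so that $\langle\psi^{\perp}|\psi^{\perp}\rangle = \langle\psi|\psi\rangle$. Taking the inner product of eq. (5) with $|\psi\rangle$ gives
$$\langle\psi|A|\psi\rangle = \alpha\langle\psi|\psi\rangle.$$
Rearranging this gives $\alpha = \langle A\rangle$. To determine $\beta$, substitute $\alpha = \langle A\rangle$ into eq. (5) and take the inner product of $A|\psi\rangle$ with itself to obtain
$$\langle\psi|A^{\dagger}A|\psi\rangle = \left(|\langle A\rangle|^2 + |\beta|^2\right)\langle\psi|\psi\rangle,$$
where we have used $\langle\psi^{\perp}|\psi^{\perp}\rangle = \langle\psi|\psi\rangle$.
Rearranging and using $\langle A^{\dagger}A\rangle = \langle\psi|A^{\dagger}A|\psi\rangle/\langle\psi|\psi\rangle$ gives
$$|\beta|^2 = \langle A^{\dagger}A\rangle - |\langle A\rangle|^2 = (\Delta A)^2.$$
This means that $\beta = (\Delta A)e^{i\theta}$ for some phase angle $\theta$. If we define $|\psi^{\perp}_A\rangle = e^{i\theta}|\psi^{\perp}\rangle$ then $|\psi^{\perp}_A\rangle$ is still orthogonal to $|\psi\rangle$, its norm is unchanged, and we have eq. (4).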
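As a numerical illustration of proposition 2.1, the following minimal Python sketch computes $\langle A\rangle$, $\Delta A$ and $|\psi^{\perp}_A\rangle$ for a qubit and checks the orthogonality and norm conditions. The helper names (`matvec`, `inner`, `av_decompose`) and the test state are illustrative assumptions, not part of the article, and the sketch assumes a unit vector with $\Delta A > 0$.

```python
import math

def matvec(M, v):
    """Apply a matrix (list of rows) to a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def av_decompose(A, psi):
    """Return (<A>, Delta A, |psi_perp_A>) for a unit vector psi."""
    A_psi = matvec(A, psi)
    mean = inner(psi, A_psi)                          # alpha = <psi|A|psi>
    var = inner(A_psi, A_psi).real - abs(mean) ** 2   # |beta|^2 = <A^dagger A> - |<A>|^2
    delta = math.sqrt(max(var, 0.0))
    # Rearranged identity: |psi_perp_A> = (A - <A>)|psi> / Delta A.
    # Assumes Delta A > 0, i.e. psi is not an eigenvector of A.
    perp = [(a - mean * p) / delta for a, p in zip(A_psi, psi)]
    return mean, delta, perp

sigma_x = [[0, 1], [1, 0]]
psi = [complex(math.cos(0.3)), complex(math.sin(0.3))]  # illustrative unit vector
mean, delta, perp = av_decompose(sigma_x, psi)

overlap = abs(inner(psi, perp))      # should vanish: psi_perp_A is orthogonal to psi
perp_norm = inner(perp, perp).real   # should equal <psi|psi> = 1
```

For this state the expectation value is $\sin(0.6)$ and the standard deviation is $\cos(0.6)$, so the decomposition can also be checked by hand.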
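The following corollary is useful for finding the conditions for equality in uncertainty relations.
Corollary 2.2. In general, for two operators $A$ and $B$, and for a unit vector $|\psi\rangle$,
$$\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle = \frac{\langle A^{\dagger}B\rangle - \langle A\rangle^{*}\langle B\rangle}{\Delta A\,\Delta B}.$$
Proof. From proposition 2.1, we have
$$A|\psi\rangle = \langle A\rangle|\psi\rangle + \Delta A\,|\psi^{\perp}_A\rangle, \qquad B|\psi\rangle = \langle B\rangle|\psi\rangle + \Delta B\,|\psi^{\perp}_B\rangle.$$
Taking the inner product of these gives
$$\langle A^{\dagger}B\rangle = \langle A\rangle^{*}\langle B\rangle + \Delta A\,\Delta B\,\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle.$$
Rearranging gives the desired result.
Note that, if $A$ and $B$ are Hermitian then we have
$$\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle = \frac{\langle AB\rangle - \langle A\rangle\langle B\rangle}{\Delta A\,\Delta B}. \tag{12}$$
If it is also the case that $[A,B] = 0$ then eq. (12) is the correlation, denoted $\mathrm{corr}_{A,B}$, that would be obtained from a joint measurement of $A$ and $B$. The correlation is a well-known statistical measure of how two random variables are related to one another. Equation (12) is a formal generalization of the correlation, so we will also denote it $\mathrm{corr}_{A,B}$. However, if $A$ and $B$ do not commute then $\mathrm{corr}_{A,B}$ is generally a complex number, there is no joint measurement of $A$ and $B$ of which $\mathrm{corr}_{A,B}$ could be the correlation, and $AB$ is not an observable. The real and imaginary parts of $\mathrm{corr}_{A,B}$ are
$$\mathrm{Re}\left(\mathrm{corr}_{A,B}\right) = \frac{\frac{1}{2}\langle\{A,B\}\rangle - \langle A\rangle\langle B\rangle}{\Delta A\,\Delta B}, \tag{13}$$
$$\mathrm{Im}\left(\mathrm{corr}_{A,B}\right) = \frac{\frac{1}{2i}\langle[A,B]\rangle}{\Delta A\,\Delta B}.$$
The real part is also a formal generalization of the correlation in that it reduces to the classical formula when $A$ and $B$ commute. We denote it $\mathrm{Rcorr}_{A,B}$.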

The Robertson and Schrödinger Uncertainty Relations
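We are now in a position to prove the Robertson and Schrödinger uncertainty relations.
Proposition 3.1 (The Robertson Uncertainty Relation). Let $A$ and $B$ be two Hermitian operators on a Hilbert space $\mathcal{H}$. Then, for any unit vector $|\psi\rangle \in \mathcal{H}$,
$$\Delta A\,\Delta B \geq \frac{1}{2}\left|\langle[A,B]\rangle\right|.$$
Proof. From the Aharonov-Vaidman identity, we have
$$A|\psi\rangle = \langle A\rangle|\psi\rangle + \Delta A\,|\psi^{\perp}_A\rangle, \qquad B|\psi\rangle = \langle B\rangle|\psi\rangle + \Delta B\,|\psi^{\perp}_B\rangle.$$
Taking the inner product of these two equations and its complex conjugate gives
$$\langle AB\rangle = \langle A\rangle\langle B\rangle + \Delta A\,\Delta B\,\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle, \tag{18}$$
$$\langle BA\rangle = \langle A\rangle\langle B\rangle + \Delta A\,\Delta B\,\langle\psi^{\perp}_B|\psi^{\perp}_A\rangle. \tag{19}$$
Subtracting these two equations gives
$$\langle AB\rangle - \langle BA\rangle = \Delta A\,\Delta B\left(\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle - \langle\psi^{\perp}_B|\psi^{\perp}_A\rangle\right),$$
or,
$$\langle[A,B]\rangle = \Delta A\,\Delta B\left(\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle - \langle\psi^{\perp}_B|\psi^{\perp}_A\rangle\right). \tag{21}$$
Since $\langle\psi^{\perp}_B|\psi^{\perp}_A\rangle$ is the complex conjugate of $\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle$, we can rewrite this as
$$\langle[A,B]\rangle = 2i\,\Delta A\,\Delta B\,\mathrm{Im}\left(\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle\right).$$
Taking the absolute value of both sides and rearranging gives
$$\Delta A\,\Delta B\,\left|\mathrm{Im}\left(\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle\right)\right| = \frac{1}{2}\left|\langle[A,B]\rangle\right|.$$
Because $|\psi^{\perp}_A\rangle$ and $|\psi^{\perp}_B\rangle$ are unit vectors, $0 \leq |\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle|^2 \leq 1$, and hence the absolute value of the imaginary part of $\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle$ is also bounded between 0 and 1. Hence, we have
$$\Delta A\,\Delta B \geq \frac{1}{2}\left|\langle[A,B]\rangle\right|.$$
The condition for equality in the Robertson relation is
$$\mathrm{Im}\left(\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle\right) = \pm 1, \quad\text{i.e.}\quad \mathrm{corr}_{A,B} = \pm i.$$
States that saturate the inequality are called (Robertson) intelligent states. The condition $\mathrm{corr}_{A,B} = \pm i$ can be used to find intelligent states, although this is not easier than solving for equality in the Robertson relation directly.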
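The Robertson relation can be checked numerically for a simple case. The sketch below, which is only an illustration under assumed choices (Pauli observables $\sigma_x$, $\sigma_y$ and an arbitrary qubit state parametrized by hypothetical angles `theta` and `phi`), computes both sides of the inequality.

```python
import cmath
import math

def matvec(M, v):
    """Apply a matrix (list of rows) to a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def mean_and_std(A, psi):
    """Expectation value and standard deviation of Hermitian A in the unit vector psi."""
    A_psi = matvec(A, psi)
    mean = inner(psi, A_psi).real              # real because A is Hermitian
    var = inner(A_psi, A_psi).real - mean ** 2
    return mean, math.sqrt(max(var, 0.0))

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

theta, phi = 1.1, 0.7  # arbitrary Bloch-sphere angles (illustrative)
psi = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]

_, dx = mean_and_std(sigma_x, psi)
_, dy = mean_and_std(sigma_y, psi)

# For Hermitian A and B, <AB> = <A psi|B psi> and <BA> = <B psi|A psi>.
x_psi, y_psi = matvec(sigma_x, psi), matvec(sigma_y, psi)
commutator_mean = inner(x_psi, y_psi) - inner(y_psi, x_psi)

lhs = dx * dy                      # Delta A * Delta B
rhs = 0.5 * abs(commutator_mean)   # (1/2) |<[A, B]>|
```

Since $[\sigma_x, \sigma_y] = 2i\sigma_z$, the right hand side equals $|\cos\theta|$ for this parametrization, which gives an independent check on `rhs`.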

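Proposition 3.2 (The Schrödinger Uncertainty Relation).
Let $A$ and $B$ be two Hermitian operators on a Hilbert space $\mathcal{H}$. Then, for any unit vector $|\psi\rangle \in \mathcal{H}$,
$$(\Delta A)^2(\Delta B)^2 \geq \left|\frac{1}{2}\langle\{A,B\}\rangle - \langle A\rangle\langle B\rangle\right|^2 + \left|\frac{1}{2}\langle[A,B]\rangle\right|^2.$$
Proof. Taking the sum of eq. (18) and eq. (19) gives
$$\langle AB\rangle + \langle BA\rangle = 2\langle A\rangle\langle B\rangle + \Delta A\,\Delta B\left(\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle + \langle\psi^{\perp}_B|\psi^{\perp}_A\rangle\right),$$
or,
$$\langle\{A,B\}\rangle = 2\langle A\rangle\langle B\rangle + \Delta A\,\Delta B\left(\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle + \langle\psi^{\perp}_B|\psi^{\perp}_A\rangle\right).$$
Adding this to eq. (21) gives
$$\langle\{A,B\}\rangle + \langle[A,B]\rangle = 2\langle A\rangle\langle B\rangle + 2\,\Delta A\,\Delta B\,\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle,$$
or,
$$\left(\frac{1}{2}\langle\{A,B\}\rangle - \langle A\rangle\langle B\rangle\right) + \frac{1}{2}\langle[A,B]\rangle = \Delta A\,\Delta B\,\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle. \tag{29}$$
Now, because $A$ and $B$ are Hermitian, $\{A,B\}$ is Hermitian and $[A,B]$ is anti-Hermitian. Therefore $\langle\{A,B\}\rangle$ is real and $\langle[A,B]\rangle$ is imaginary. Further, $\langle A\rangle$, $\langle B\rangle$, $\Delta A$ and $\Delta B$ are real. Therefore, taking the modulus squared of eq. (29) gives
$$\left|\frac{1}{2}\langle\{A,B\}\rangle - \langle A\rangle\langle B\rangle\right|^2 + \left|\frac{1}{2}\langle[A,B]\rangle\right|^2 = (\Delta A)^2(\Delta B)^2\left|\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle\right|^2.$$
Finally, because $|\psi^{\perp}_A\rangle$ and $|\psi^{\perp}_B\rangle$ are unit vectors, we have $|\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle|^2 \leq 1$, which gives the result.
The condition for equality in the Schrödinger relation is
$$\left|\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle\right|^2 = \left|\mathrm{corr}_{A,B}\right|^2 = 1.$$
States that saturate the inequality are called (Schrödinger) intelligent states. The condition $|\mathrm{corr}_{A,B}|^2 = 1$ can be used to find intelligent states, although this is not easier than solving for equality in the Schrödinger relation directly.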
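The equality condition can be illustrated with a qubit. In a two-dimensional Hilbert space the vectors orthogonal to $|\psi\rangle$ form a one-dimensional subspace, so $|\psi^{\perp}_A\rangle$ and $|\psi^{\perp}_B\rangle$ can differ only by a phase and the Schrödinger relation holds with equality whenever both standard deviations are nonzero. The following sketch checks this for an assumed state and the Pauli observables $\sigma_x$, $\sigma_y$; the variable names are illustrative only.

```python
import cmath
import math

def matvec(M, v):
    """Apply a matrix (list of rows) to a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def mean_and_std(A, psi):
    """Expectation value and standard deviation of Hermitian A in the unit vector psi."""
    A_psi = matvec(A, psi)
    mean = inner(psi, A_psi).real
    var = inner(A_psi, A_psi).real - mean ** 2
    return mean, math.sqrt(max(var, 0.0))

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

theta, phi = 1.1, 0.7  # arbitrary Bloch-sphere angles (illustrative)
psi = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]

mx, dx = mean_and_std(sigma_x, psi)
my, dy = mean_and_std(sigma_y, psi)

x_psi, y_psi = matvec(sigma_x, psi), matvec(sigma_y, psi)
mean_AB = inner(x_psi, y_psi)   # <AB>
mean_BA = inner(y_psi, x_psi)   # <BA>

# Generalized correlation of eq. (12): (<AB> - <A><B>) / (Delta A Delta B).
corr = (mean_AB - mx * my) / (dx * dy)

# Both sides of the Schrodinger relation.
lhs = (dx * dy) ** 2
rhs = (0.5 * (mean_AB + mean_BA).real - mx * my) ** 2 + abs(0.5 * (mean_AB - mean_BA)) ** 2
```

Here `abs(corr)` comes out as 1 and `lhs` equals `rhs` up to rounding, whereas the Robertson bound alone is generally not saturated for the same state.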

The Textbook Proof and The Cauchy-Schwarz Inequality
The textbook proofs of the Robertson and Schrödinger uncertainty relations are based on the Cauchy-Schwarz inequality
$$|\langle f|g\rangle|^2 \leq \langle f|f\rangle\langle g|g\rangle.$$
Note that the proofs given in section 3 also make use of a special case of this inequality: that for unit vectors
$$|\langle f|g\rangle|^2 \leq 1.$$
My aim is not to eliminate any use of the Cauchy-Schwarz inequality, but just to argue that the proof is more memorable if the inequality is applied in a different way than in the standard proof.
In the standard proof, the Cauchy-Schwarz inequality is applied to the two vectors $|f\rangle = (A - \langle A\rangle)|\psi\rangle$ and $|g\rangle = (B - \langle B\rangle)|\psi\rangle$. A few lines of messy algebra and cancellations, which I will spare you the details of, yields
$$(\Delta A)^2(\Delta B)^2 \geq \left|\langle AB\rangle - \langle A\rangle\langle B\rangle\right|^2,$$
from which we can derive the Schrödinger and Robertson relations by recognizing the real and imaginary parts of the right hand side. As physics students do not often see the Cauchy-Schwarz inequality prior to their first course on quantum mechanics, most textbooks include a proof of this as well. One of the common proofs uses reasoning similar to that which we used to establish the Aharonov-Vaidman identity. It starts by recognizing that $|g\rangle$ can be written as
$$|g\rangle = \alpha|f\rangle + \beta|f^{\perp}\rangle, \tag{34}$$
where $|f^{\perp}\rangle$ is a unit vector that is orthogonal to $|f\rangle$. To find $\alpha$, take the inner product of this with $|f\rangle$, which yields $\alpha = \langle f|g\rangle/\langle f|f\rangle$. Substituting this back into eq. (34) and then taking the inner product of $|g\rangle$ with itself gives
$$\langle g|g\rangle = \frac{|\langle f|g\rangle|^2}{\langle f|f\rangle} + |\beta|^2.$$
The Cauchy-Schwarz inequality follows from this by recognizing that $|\beta|^2$ is real and non-negative.
Summarizing, the standard proof of the Robertson inequality consists of proving the Cauchy-Schwarz inequality and then finding convenient vectors to insert into the inequality that will yield terms involving $\Delta A$ and $\Delta B$ after some algebra. From the Aharonov-Vaidman identity, we can see that the reason the choice $|f\rangle = (A - \langle A\rangle)|\psi\rangle$ and $|g\rangle = (B - \langle B\rangle)|\psi\rangle$ is guaranteed to work is that $|f\rangle = \Delta A\,|\psi^{\perp}_A\rangle$ and $|g\rangle = \Delta B\,|\psi^{\perp}_B\rangle$. After inserting these choices, one has to multiply out and simplify the expressions in the Cauchy-Schwarz inequality. This involves recognizing things like $\langle A\rangle\langle\psi|A|\psi\rangle = \langle A\rangle^2$ and then canceling several terms. It is difficult for students to follow the full details of this in a lecture. In the approach using the Aharonov-Vaidman identity, we already have expressions involving $\Delta A$ and $\Delta B$, so it is easier to see how to get an expression involving $\Delta A\,\Delta B$. This expression has fewer terms and there is less cancellation to do.
Although the approach using the Aharonov-Vaidman identity uses the Cauchy-Schwarz inequality in a less convoluted way, it uses similar mathematical ideas. For vectors $|f\rangle$ and $|g\rangle$, we can write $|g\rangle$ in terms of $|f\rangle$ and an orthogonal vector, as in the proof of Cauchy-Schwarz, or we can write both vectors in terms of a third vector $|h\rangle$ as
$$|f\rangle = \alpha_f|h\rangle + \beta_f|h^{\perp}_f\rangle, \qquad |g\rangle = \alpha_g|h\rangle + \beta_g|h^{\perp}_g\rangle,$$
where $|h^{\perp}_f\rangle$ and $|h^{\perp}_g\rangle$ are orthogonal to $|h\rangle$. The advantage of this approach is that it immediately yields expressions involving the expectation values and standard deviations of the observables, which it is easy to see what to do with in order to get the uncertainty relations. From this point of view, the standard proof looks like shoehorning something into the Cauchy-Schwarz inequality that will yield standard deviations, and then backtracking to a point more easily obtained from the Aharonov-Vaidman identity. At the end of the day, both approaches use the same mathematics, but the Aharonov-Vaidman approach does so in a simpler and more direct way.
I would go so far as to say that whenever you are tempted to use the Cauchy-Schwarz inequality to prove a relationship between standard deviations of observables in quantum mechanics, you will have an easier time working from the Aharonov-Vaidman identity (and the special case $|\langle f|g\rangle|^2 \leq 1$ of the Cauchy-Schwarz inequality for unit vectors) instead. Section 6 and section 7 give more examples of this.
I end this section by showing that you can prove the Cauchy-Schwarz inequality from the Aharonov-Vaidman identity. I include this not because I think it is the best way to prove the Cauchy-Schwarz inequality, but because finding alternative proofs of the Cauchy-Schwarz inequality is the mathematician's equivalent of the sport of finding new uncertainty relations in quantum mechanics. It also shows that, in principle, there is nothing that can be proved using the Cauchy-Schwarz inequality that could not be proved using the Aharonov-Vaidman identity. Of course, outside the context of standard deviations in quantum mechanics, using the Aharonov-Vaidman identity instead of the Cauchy-Schwarz inequality is unlikely to yield a better proof.
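Proposition 4.1 (The Cauchy-Schwarz Inequality). Let $|f\rangle$ and $|g\rangle$ be vectors in a Hilbert space $\mathcal{H}$. Then, $|\langle f|g\rangle|^2 \leq \langle f|f\rangle\langle g|g\rangle$.
Proof. First note that the inequality trivially holds whenever $\langle f|g\rangle = 0$ and that $\langle f|f\rangle = 0$ implies $\langle f|g\rangle = 0$. Therefore, we can assume that both $\langle f|g\rangle \neq 0$ and $\langle f|f\rangle > 0$.
Let $P = |g\rangle\langle g|$. Note this is not necessarily a projector because $|g\rangle$ does not have to be normalized, but it is a Hermitian operator. Applying the Aharonov-Vaidman identity to $P$ and $|f\rangle$ gives
$$P|f\rangle = \langle P\rangle|f\rangle + \Delta P\,|f^{\perp}_P\rangle,$$
or equivalently
$$|g\rangle\langle g|f\rangle = \frac{|\langle f|g\rangle|^2}{\langle f|f\rangle}|f\rangle + \Delta P\,|f^{\perp}_P\rangle. \tag{40}$$
Taking the inner product with $|f^{\perp}_P\rangle$ gives
$$\langle f^{\perp}_P|g\rangle\langle g|f\rangle = \Delta P\,\langle f|f\rangle,$$
where we used the fact that $\langle f^{\perp}_P|f^{\perp}_P\rangle = \langle f|f\rangle$. Rearranging and taking the complex conjugate gives
$$\langle g|f^{\perp}_P\rangle = \frac{\Delta P\,\langle f|f\rangle}{\langle f|g\rangle}. \tag{42}$$
Now, taking the inner product of eq. (40) with $|g\rangle$ gives
$$\langle g|g\rangle\langle g|f\rangle = \frac{|\langle f|g\rangle|^2}{\langle f|f\rangle}\langle g|f\rangle + \Delta P\,\langle g|f^{\perp}_P\rangle.$$
Multiplying both sides by $\langle f|f\rangle/\langle g|f\rangle$ gives
$$\langle f|f\rangle\langle g|g\rangle = |\langle f|g\rangle|^2 + \Delta P\,\frac{\langle f|f\rangle\langle g|f^{\perp}_P\rangle}{\langle g|f\rangle}.$$
Substituting eq. (42) into this gives
$$\langle f|f\rangle\langle g|g\rangle = |\langle f|g\rangle|^2 + \frac{(\Delta P)^2\langle f|f\rangle^2}{\langle f|g\rangle\langle g|f\rangle},$$
or
$$\langle f|f\rangle\langle g|g\rangle = |\langle f|g\rangle|^2 + \left(\frac{\Delta P\,\langle f|f\rangle}{|\langle f|g\rangle|}\right)^2.$$
Now, the terms $\Delta P$, $\langle f|f\rangle$ and $|\langle f|g\rangle|$ are all real and non-negative. Hence,
$$\langle f|f\rangle\langle g|g\rangle \geq |\langle f|g\rangle|^2.$$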

Pedagogical Notes
In order to teach the Robertson uncertainty relation via the Aharonov-Vaidman identity, you first have to establish the Aharonov-Vaidman identity. For the purposes of proving the Robertson uncertainty relation, it is sufficient to restrict the operator in the identity to be Hermitian and the vector $|\psi\rangle$ to be a unit vector, as I shall in this section. In my experience, not all students immediately understand why, given a unit vector $|\psi\rangle$, any other unit vector $|\phi\rangle$ can be written as
$$|\phi\rangle = \alpha|\psi\rangle + \beta|\psi^{\perp}\rangle, \tag{48}$$
where $|\psi^{\perp}\rangle$ is a unit vector orthogonal to $|\psi\rangle$. They will probably have seen Gram-Schmidt orthogonalization in a linear algebra class, but may have difficulty using that knowledge here due to the jump to abstract Hilbert spaces and Dirac notation. To aid intuition, I remark that $|\psi\rangle$ and $|\phi\rangle$ span a two-dimensional subspace of $\mathcal{H}$ and show them fig. 1. By the process of Gram-Schmidt orthogonalization, we can construct an orthonormal basis for this subspace consisting of $|\psi\rangle$ and
$$|\psi^{\perp}\rangle = \frac{|\phi\rangle - \langle\psi|\phi\rangle|\psi\rangle}{\sqrt{1 - |\langle\psi|\phi\rangle|^2}},$$
from which we have eq. (48) with $\alpha = \langle\psi|\phi\rangle$ and $\beta = \sqrt{1 - |\langle\psi|\phi\rangle|^2}$.
In my quantum mechanics classes, I set students in-class activities that involve things like deriving important equations or making order of magnitude estimates. These take about 5-10 minutes each and are done in pairs. I usually do two or three such activities per class. I believe this increases active engagement and retention of the main principles. I try to reduce the number of long derivations that I do myself on the board because I think they cause confusion about what the most important equations are and the derivations are rarely remembered by the students. However, I also do not want to set the students a long and complicated derivation to do themselves in class, so I try to find shorter derivations that they can do with guidance instead. The proof of the Robertson relation from the Aharonov-Vaidman identity is better suited to this approach than the standard proof.
After establishing eq. (48), I set students the following activity.

In Class Activity
Given that $A|\psi\rangle = \alpha|\psi\rangle + \beta|\psi^{\perp}\rangle$, find $\alpha$ and $\beta$ in terms of the expectation value $\langle A\rangle$ and standard deviation $\Delta A$ of $A$ in the state $|\psi\rangle$.
Although some students can do this straight away, most need some help. During the course of the activity, I walk around the class to get an idea of how they are doing. When it seems like many students are stuck, I reveal the following three hints in sequence.

Try taking the inner product of $A|\psi\rangle$ with itself.
Although most students can get $\alpha = \langle A\rangle$ either straight away or after the first hint, $|\beta| = \Delta A$ is more challenging. After taking the inner product with $|\psi\rangle$, the obvious instinct is to take the inner product with $|\psi^{\perp}\rangle$, which does not help, so the third hint is usually needed. After this, it is a short hop to the Robertson relation via the proof given in section 3.
I think it would be more difficult to teach the standard proof in this way. One would either have to ask the students to derive the Cauchy-Schwarz inequality for themselves or derive the Robertson relation from Cauchy-Schwarz. The former is a bit abstract for a quantum mechanics class and the latter involves a lot of algebra and cancellations with a high potential for making mistakes. Both would require a large number of hints. In contrast, the proof of the Aharonov-Vaidman identity is relatively short, and I think that students who retain the identity are more likely to be able to reconstruct the proof of the Robertson relation for themselves.

Other Uncertainty Relations for Standard Deviations
Despite the ubiquity of the Schrödinger-Robertson uncertainty relations in quantum mechanics classes, there are good reasons to go beyond them. For example, consider a spin-1/2 particle with spin operators $S_x$, $S_y$ and $S_z$. For this case, the Robertson uncertainty relation is $\Delta S_x\,\Delta S_y \geq \frac{\hbar}{2}|\langle S_z\rangle|$. Let $|x+\rangle$ be the spin-up state in the $x$ direction. For this state we have $\langle S_z\rangle = 0$, which is perfectly valid because $|x+\rangle$ is an eigenstate of $S_x$ and hence $\Delta S_x = 0$. However, because $[S_x, S_y] \neq 0$ there is necessarily some uncertainty in $S_y$ and in fact $\Delta S_y = \hbar/2$. The Schrödinger relation also yields $\Delta S_x\,\Delta S_y \geq 0$. So the Schrödinger-Robertson relations do not capture all uncertainty trade-offs that necessarily exist in quantum mechanics.
More generally, for bounded operators $A$ and $B$, any uncertainty relation of the form $\Delta A\,\Delta B \geq f(A, B, |\psi\rangle)$ for some function $f$ must necessarily have $f(A, B, |\psi\rangle) = 0$ whenever $|\psi\rangle$ is an eigenstate of $A$ or $B$. For this reason, it makes sense to seek uncertainty relations that bound the sum of standard deviations $\Delta A + \Delta B$, the sum of variances $(\Delta A)^2 + (\Delta B)^2$, or more exotic combinations. We shall discuss the Maccone-Pati relations, and some simple generalizations, in this section.
Uncertainty relations are classified as either state dependent or state independent, depending on whether the right hand side of the inequality depends on the state $|\psi\rangle$. For two observables $A$ and $B$, a state dependent uncertainty relation is of the form $f(\Delta A, \Delta B) \geq g(A, B, |\psi\rangle)$, where $f$ and $g$ are specified functions, whereas a state independent uncertainty relation would be of the form $f(\Delta A, \Delta B) \geq g(A, B)$, noting that $g$ is no longer allowed to depend on $|\psi\rangle$.
On the face of it, a state dependent uncertainty relation is a strange idea, since, for any given normalized state $|\psi\rangle$, we can always just calculate the uncertainties $\Delta A$ and $\Delta B$ and get the exact value of $f(\Delta A, \Delta B)$. Therefore, bounds on uncertainty that apply to all states seem more useful.
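The spin-1/2 example can be verified with a few lines of Python. This is a minimal sketch that assumes units with $\hbar = 1$ and the standard matrix representation of the spin operators in the $S_z$ eigenbasis; the helper names are illustrative.

```python
import math

def matvec(M, v):
    """Apply a matrix (list of rows) to a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def std(A, psi):
    """Standard deviation of Hermitian A in the unit vector psi."""
    A_psi = matvec(A, psi)
    mean = inner(psi, A_psi).real
    var = inner(A_psi, A_psi).real - mean ** 2
    return math.sqrt(max(var, 0.0))

hbar = 1.0  # assumption: units in which hbar = 1
S_x = [[0, hbar / 2], [hbar / 2, 0]]
S_y = [[0, -1j * hbar / 2], [1j * hbar / 2, 0]]
S_z = [[hbar / 2, 0], [0, -hbar / 2]]

# Spin-up state in the x direction, written in the S_z eigenbasis.
x_plus = [complex(1 / math.sqrt(2)), complex(1 / math.sqrt(2))]

dSx = std(S_x, x_plus)                                # 0: eigenstate of S_x
dSy = std(S_y, x_plus)                                # hbar / 2
mean_Sz = inner(x_plus, matvec(S_z, x_plus)).real     # 0
robertson_bound = (hbar / 2) * abs(mean_Sz)           # trivial bound of 0
```

The product `dSx * dSy` and the bound `robertson_bound` both vanish, even though `dSy` is as large as it can be for a spin-1/2 particle.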
However, a state dependent uncertainty relation can be a useful step in deriving a state independent one.This can happen in two ways.First, it may happen that, for a particular choice of the observables A and B, the function g(A, B, |ψ ) turns out not to depend on |ψ .For example, the Robertson relation ∆A∆B ≥ for other classes of observable, such as spin components, is more questionable.Despite the fact that I have asked students to compute it for states of a spin-1/2 particle as a homework problem, I do not think there is ever a need to do this in practice, as it is just as easy to calculate the exact uncertainties.
The second way of obtaining a state independent uncertainty relation from a state dependent one is to optimize, i.e. if $f(\Delta A, \Delta B) \geq g(A, B, |\psi\rangle)$ then $f(\Delta A, \Delta B) \geq \min_{|\psi\rangle} g(A, B, |\psi\rangle)$. Of course, if $f(\Delta A, \Delta B) = \Delta A\,\Delta B$ and $A$ and $B$ are bounded operators then this leads to the trivial relation $\Delta A\,\Delta B \geq 0$ because we can choose $|\psi\rangle$ to be an eigenstate of either $A$ or $B$. However, for sums and more general combinations of observables, optimization can lead to a nontrivial relation.
Further, if we are considering a set of experiments that can only prepare a subset of the possible states, then we can get an uncertainty relation that applies to those states by optimizing over the subset. An example might be experiments in which we can only prepare the system in a Gaussian state. Although this does not yield a state independent uncertainty relation, it is more useful than a completely state dependent one, as it allows us to bound the possible uncertainties for a class of relevant states.
To summarize, state dependent uncertainty relations are a strange idea, and I am not sure whether they would ever have been considered had not Robertson introduced one as a way-point in proving the Heisenberg relation. However, they can be useful in proving more generally applicable uncertainty relations. The relations that we discuss here are state dependent.
The remainder of this section is structured as follows. In section 6.1 we use the Aharonov-Vaidman identity to prove two propositions, called the sum relations, that will be used repeatedly. In section 6.2, we give an Aharonov-Vaidman based proof of the Maccone-Pati uncertainty relations, and in section 6.3 we give some simple generalizations.

The Sum Relations
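Proposition 6.1. Let $A$ and $B$ be linear operators acting on $\mathcal{H}$. Then, for any $|\psi\rangle \in \mathcal{H}$,
$$\Delta(A+B)\,|\psi^{\perp}_{A+B}\rangle = \Delta A\,|\psi^{\perp}_A\rangle + \Delta B\,|\psi^{\perp}_B\rangle.$$
Proof. Apply the Aharonov-Vaidman identity to $A + B$ in two different ways. The first way is to treat $A + B$ as a single operator,
$$(A+B)|\psi\rangle = \langle A+B\rangle|\psi\rangle + \Delta(A+B)\,|\psi^{\perp}_{A+B}\rangle.$$
The second way is to apply the identity to $A$ and $B$ separately and add the results,
$$(A+B)|\psi\rangle = \left(\langle A\rangle + \langle B\rangle\right)|\psi\rangle + \Delta A\,|\psi^{\perp}_A\rangle + \Delta B\,|\psi^{\perp}_B\rangle.$$
Since $\langle A+B\rangle = \langle A\rangle + \langle B\rangle$, equating the two expressions gives the result.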

The Maccone-Pati Uncertainty Relations
Between the time of Robertson's uncertainty relation and now, there has always been some literature on uncertainty relations for variances and standard deviations.However, the field was reinvigorated in 2014, when Maccone and Pati [20] proved a pair of uncertainty relations for sums of variances, which always give a nontrivial bound, even in the case of an eigenstate of an observable.
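Theorem 6.4 (The First Maccone-Pati Uncertainty Relation). Let $A$ and $B$ be Hermitian operators on a Hilbert space $\mathcal{H}$ and let $|\psi\rangle \in \mathcal{H}$ be a unit vector. Then,
$$(\Delta A)^2 + (\Delta B)^2 \geq \mp i\langle[A,B]\rangle + \left|\langle\psi^{\perp}|(A \pm iB)|\psi\rangle\right|^2, \tag{55}$$
where $|\psi^{\perp}\rangle$ is any unit vector orthogonal to $|\psi\rangle$.
Proof. We will prove $(\Delta A)^2 + (\Delta B)^2 \geq -i\langle[A,B]\rangle + |\langle\psi^{\perp}|(A + iB)|\psi\rangle|^2$ by applying the Aharonov-Vaidman identity to $A + iB$. The proof of the other inequality follows by replacing $A + iB$ with $A - iB$. Note that, even though $A$ and $B$ are Hermitian, $A + iB$ is not, so it is crucial that we previously generalized the Aharonov-Vaidman identity to arbitrary linear operators. Applying the Aharonov-Vaidman identity to $A + iB$ gives
$$(A+iB)|\psi\rangle = \langle A+iB\rangle|\psi\rangle + \Delta(A+iB)\,|\psi^{\perp}_{A+iB}\rangle.$$
Taking the inner product with any unit vector $|\psi^{\perp}\rangle$ orthogonal to $|\psi\rangle$ gives
$$\langle\psi^{\perp}|(A+iB)|\psi\rangle = \Delta(A+iB)\,\langle\psi^{\perp}|\psi^{\perp}_{A+iB}\rangle,$$
and taking the modulus squared of this gives
$$\left|\langle\psi^{\perp}|(A+iB)|\psi\rangle\right|^2 = [\Delta(A+iB)]^2\left|\langle\psi^{\perp}|\psi^{\perp}_{A+iB}\rangle\right|^2 \leq [\Delta(A+iB)]^2.$$
The result now follows by expanding $[\Delta(A+iB)]^2$ as follows.
$$[\Delta(A+iB)]^2 = \langle(A-iB)(A+iB)\rangle - |\langle A\rangle + i\langle B\rangle|^2 = (\Delta A)^2 + (\Delta B)^2 + i\langle[A,B]\rangle.$$
Theorem 6.5 (The Second Maccone-Pati Uncertainty Relation). Let $A$ and $B$ be linear operators on a Hilbert space $\mathcal{H}$ and let $|\psi\rangle \in \mathcal{H}$ be a unit vector. Then,
$$(\Delta A)^2 + (\Delta B)^2 \geq \frac{1}{2}\left|\langle\psi^{\perp}_{A+B}|(A+B)|\psi\rangle\right|^2.$$
Proof. Applying the Aharonov-Vaidman identity to $A + B$ gives
$$(A+B)|\psi\rangle = \langle A+B\rangle|\psi\rangle + \Delta(A+B)\,|\psi^{\perp}_{A+B}\rangle.$$
Taking the inner product with $|\psi^{\perp}_{A+B}\rangle$ gives
$$\begin{aligned}\langle\psi^{\perp}_{A+B}|(A+B)|\psi\rangle &= \Delta(A+B) \\ &\leq \Delta A + \Delta B,\end{aligned}$$
where the second line follows from the sum relation. We could stop here and regard $\Delta A + \Delta B \geq \langle\psi^{\perp}_{A+B}|(A+B)|\psi\rangle$ as an uncertainty relation, but Maccone and Pati wanted a relation in terms of variances to compare to their first result. To do this, we take the modulus squared of both sides to obtain
$$(\Delta A + \Delta B)^2 \geq \left|\langle\psi^{\perp}_{A+B}|(A+B)|\psi\rangle\right|^2.$$
The result now follows from the real number inequality $x^2 + y^2 \geq \frac{1}{2}(x+y)^2$ with $x = \Delta A$ and $y = \Delta B$. For completeness, this inequality is proved as follows.
$$(x-y)^2 \geq 0 \;\Rightarrow\; x^2 + y^2 \geq 2xy \;\Rightarrow\; 2(x^2+y^2) \geq x^2 + 2xy + y^2 = (x+y)^2.$$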
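To see that the Maccone-Pati relations give a nontrivial bound where the Robertson relation does not, the following sketch evaluates them for the eigenstate $|x+\rangle$ of $\sigma_x$ with $A = \sigma_x$ and $B = \sigma_y$. It is an illustration under assumed choices: the helper `lincomb` is hypothetical, the first relation is evaluated in the form $(\Delta A)^2 + (\Delta B)^2 \geq -i\langle[A,B]\rangle + |\langle\psi^{\perp}|(A+iB)|\psi\rangle|^2$, and the second uses $\langle\psi^{\perp}_{A+B}|(A+B)|\psi\rangle = \Delta(A+B)$ from the proof of theorem 6.5.

```python
import math

def matvec(M, v):
    """Apply a matrix (list of rows) to a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def lincomb(a, A, b, B):
    """Return the matrix a*A + b*B."""
    return [[a * x + b * y for x, y in zip(rA, rB)] for rA, rB in zip(A, B)]

def variance(A, psi):
    """Variance of Hermitian A in the unit vector psi."""
    A_psi = matvec(A, psi)
    return inner(A_psi, A_psi).real - inner(psi, A_psi).real ** 2

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
s = 1 / math.sqrt(2)
psi = [complex(s), complex(s)]     # |x+>, an eigenstate of sigma_x
perp = [complex(s), complex(-s)]   # |x->, a unit vector orthogonal to |x+>

lhs = variance(sigma_x, psi) + variance(sigma_y, psi)

x_psi, y_psi = matvec(sigma_x, psi), matvec(sigma_y, psi)
comm = inner(x_psi, y_psi) - inner(y_psi, x_psi)   # <[A, B]>, purely imaginary
robertson_bound = 0.5 * abs(comm)                  # vanishes for this state

# First relation: -i<[A,B]> + |<perp|(A + iB)|psi>|^2.
overlap = inner(perp, matvec(lincomb(1, sigma_x, 1j, sigma_y), psi))
first_bound = (-1j * comm).real + abs(overlap) ** 2

# Second relation: (1/2) |<psi_perp_{A+B}|(A+B)|psi>|^2 = (1/2) [Delta(A+B)]^2.
second_bound = 0.5 * variance(lincomb(1, sigma_x, 1, sigma_y), psi)
```

For this state the left hand side is 1, the first bound is 1 (so it is tight) and the second bound is 1/2, while the Robertson bound on the product is 0.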

Generalizations
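Generalizations of the Maccone-Pati uncertainty relations can be obtained by applying the Aharonov-Vaidman identity to more general linear combinations $\alpha A + \beta B$, where $\alpha, \beta \in \mathbb{C}$. This gives
$$(\alpha A + \beta B)|\psi\rangle = \langle\alpha A + \beta B\rangle|\psi\rangle + \Delta(\alpha A + \beta B)\,|\psi^{\perp}_{\alpha A+\beta B}\rangle. \tag{57}$$
Applying the strategy we used to prove theorem 6.4, we can take the inner product of this with an arbitrary unit vector $|\psi^{\perp}\rangle$ that is orthogonal to $|\psi\rangle$, which gives
$$\langle\psi^{\perp}|(\alpha A + \beta B)|\psi\rangle = \Delta(\alpha A + \beta B)\,\langle\psi^{\perp}|\psi^{\perp}_{\alpha A+\beta B}\rangle.$$
We can now take the modulus squared of this and recognize that $0 \leq |\langle\psi^{\perp}|\psi^{\perp}_{\alpha A+\beta B}\rangle|^2 \leq 1$, so that $[\Delta(\alpha A + \beta B)]^2 \geq |\langle\psi^{\perp}|(\alpha A + \beta B)|\psi\rangle|^2$. Next, for Hermitian $A$ and $B$, we can expand
$$[\Delta(\alpha A + \beta B)]^2 = |\alpha|^2(\Delta A)^2 + |\beta|^2(\Delta B)^2 + 2\,\mathrm{Re}\left[\alpha^{*}\beta\left(\langle AB\rangle - \langle A\rangle\langle B\rangle\right)\right]$$
and rearrange to obtain
$$|\alpha|^2(\Delta A)^2 + |\beta|^2(\Delta B)^2 \geq -2\,\mathrm{Re}\left[\alpha^{*}\beta\left(\langle AB\rangle - \langle A\rangle\langle B\rangle\right)\right] + \left|\langle\psi^{\perp}|(\alpha A + \beta B)|\psi\rangle\right|^2. \tag{58}$$
Substituting $\alpha = 1$, $\beta = i$ and $\alpha = 1$, $\beta = -i$ immediately yields the first Maccone-Pati uncertainty relation.
Alternatively, we can apply the strategy used to prove theorem 6.5. Starting from eq. (57), we can take the inner product with $|\psi^{\perp}_{\alpha A+\beta B}\rangle$ and rearrange to obtain
$$\Delta(\alpha A + \beta B) = \langle\psi^{\perp}_{\alpha A+\beta B}|(\alpha A + \beta B)|\psi\rangle.$$
Using the sum relation, together with $\Delta(\alpha A) = |\alpha|\Delta A$, gives
$$|\alpha|\Delta A + |\beta|\Delta B \geq \langle\psi^{\perp}_{\alpha A+\beta B}|(\alpha A + \beta B)|\psi\rangle.$$
Finally, squaring and using the inequality $x^2 + y^2 \geq \frac{1}{2}(x+y)^2$ gives
$$|\alpha|^2(\Delta A)^2 + |\beta|^2(\Delta B)^2 \geq \frac{1}{2}\left|\langle\psi^{\perp}_{\alpha A+\beta B}|(\alpha A + \beta B)|\psi\rangle\right|^2. \tag{59}$$
The inequalities eq. (58) and eq. (59) are related to some of the generalizations of the Maccone-Pati uncertainty relations that have previously appeared in the literature [21,28]. For example, eq. (58) can be used to derive an uncertainty relation that has appeared in the literature under the name "weighted uncertainty relation" [28]. To do so, we set $\alpha = \sqrt{\lambda}$, $\beta = \pm i/\sqrt{\lambda}$ in eq. (58), where $\lambda > 0$. This yields
$$\lambda(\Delta A)^2 + \frac{1}{\lambda}(\Delta B)^2 \geq \mp i\langle[A,B]\rangle + \left|\langle\psi^{\perp}|\left(\sqrt{\lambda}A \pm \frac{i}{\sqrt{\lambda}}B\right)|\psi\rangle\right|^2.$$
This is an uncertainty relation in its own right, but the relation in [28] comes from adding this to eq. (55), which yields
$$(1+\lambda)(\Delta A)^2 + \left(1+\frac{1}{\lambda}\right)(\Delta B)^2 \geq \mp 2i\langle[A,B]\rangle + \left|\langle\psi^{\perp}_1|(A \pm iB)|\psi\rangle\right|^2 + \left|\langle\psi^{\perp}_2|\left(\sqrt{\lambda}A \pm \frac{i}{\sqrt{\lambda}}B\right)|\psi\rangle\right|^2,$$
where $|\psi^{\perp}_1\rangle$ and $|\psi^{\perp}_2\rangle$ are (possibly different) unit vectors that are orthogonal to $|\psi\rangle$. This is intended as a simple example of a generalization that is easily obtained from the Aharonov-Vaidman identity, but I expect many other uncertainty relations that are usually proved using the Cauchy-Schwarz inequality or the parallelogram law would also have simple Aharonov-Vaidman based proofs.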

Quantum Propagation of Uncertainty
In this section, we develop generalizations of the classical formulas for the propagation of uncertainty. We start with the case of linear functions in section 7.1, for which exact formulas are easy to obtain, before moving on to the general, possibly nonlinear, case in section 7.2, for which we have to employ a Taylor series approximation.

Linear Functions
We start with the simplest case: a sum of two observables. Classically, if $A$ and $B$ are random variables then
$$[\Delta(A+B)]^2 = (\Delta A)^2 + (\Delta B)^2 + 2\,\Delta A\,\Delta B\,\mathrm{corr}_{A,B}. \tag{60}$$
Consider an experiment consisting of multiple runs. On each run, the quantities $A$ and $B$ are measured. These quantities are formalized as random variables because we assume that our experiments are subject to random statistical fluctuations, and that the "true" values that we are seeking are the means $\langle A\rangle$ and $\langle B\rangle$ of these random processes. We then use the average values calculated from the data as estimates of $\langle A\rangle$ and $\langle B\rangle$, and the standard deviations as a measure of the error in our experiment. If we are actually interested in the quantity $A + B$ then we would sum the averages to form our estimate of $\langle A + B\rangle$, and we would use eq. (60) to determine the error in our estimate of $\langle A + B\rangle$. Using eq. (60) in this way is called the propagation of uncertainty or propagation of error.
If the random variables $A$ and $B$ are independent, which would be the case if the randomness were due to independent statistical errors, then $\mathrm{corr}_{A,B} = 0$ and we would have
$$[\Delta(A+B)]^2 = (\Delta A)^2 + (\Delta B)^2,$$
which is the formula for propagation of uncertainty that is most commonly used in practice.
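The classical formula for the variance of a sum can be checked on data. The sketch below uses made-up sample values and population (divide by $n$) statistics, for which the relation $[\Delta(A+B)]^2 = (\Delta A)^2 + (\Delta B)^2 + 2\Delta A\,\Delta B\,\mathrm{corr}_{A,B}$ holds exactly; the function names are illustrative.

```python
import math

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Population covariance; covariance(xs, xs) is the population variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Made-up measurement records for A and B on four runs (illustrative only).
a = [1.0, 2.0, 4.0, 7.0]
b = [0.5, 1.5, 1.0, 3.0]

std_a = math.sqrt(covariance(a, a))
std_b = math.sqrt(covariance(b, b))
corr_ab = covariance(a, b) / (std_a * std_b)

# Variance of A + B computed directly from the summed data ...
total = [x + y for x, y in zip(a, b)]
var_direct = covariance(total, total)

# ... and from the propagation formula.
var_formula = std_a ** 2 + std_b ** 2 + 2 * std_a * std_b * corr_ab
```

The two values `var_direct` and `var_formula` agree up to rounding, which is what makes it possible to propagate the uncertainty without a separate measurement of $A + B$.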
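We now want to generalize these formulas by replacing classical random variables with quantum observables. The generalization of eq. (60) is as follows.
Theorem 7.1. Let $A$ and $B$ be Hermitian operators on a Hilbert space $\mathcal{H}$. Then,
$$[\Delta(A+B)]^2 = (\Delta A)^2 + (\Delta B)^2 + 2\,\Delta A\,\Delta B\,\mathrm{Rcorr}_{A,B} = (\Delta A)^2 + (\Delta B)^2 + \langle\{A,B\}\rangle - 2\langle A\rangle\langle B\rangle. \tag{61}$$
Proof. Proposition 6.1 implies that, for any unit vector $|\psi\rangle \in \mathcal{H}$,
$$\Delta(A+B)\,|\psi^{\perp}_{A+B}\rangle = \Delta A\,|\psi^{\perp}_A\rangle + \Delta B\,|\psi^{\perp}_B\rangle.$$
Taking the inner product of this with itself gives
$$[\Delta(A+B)]^2 = (\Delta A)^2 + (\Delta B)^2 + 2\,\Delta A\,\Delta B\,\mathrm{Re}\left(\langle\psi^{\perp}_A|\psi^{\perp}_B\rangle\right).$$
Applying eq. (13) completes the proof.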
Although theorem 7.1 is a true theorem about quantum observables, it cannot be used to propagate uncertainty in the same way as its classical counterpart.Classically, we can measure A and B together in the same run of the experiment.We can then estimate A + B by summing the average values of A and B that we found in the experiment.We also have all the information we need to calculate the uncertainty ∆(A + B), i.e. ∆A, ∆B, A , B and AB , so we can determine the uncertainty without doing any more experiments.
In quantum mechanics, this is not the case. When A and B do not commute, they cannot both be accurately measured on the same run of an experiment. We can still estimate their expectation values by measuring A on half of the runs of the experiment and B on the other half and taking averages. Since ⟨A + B⟩ = ⟨A⟩ + ⟨B⟩, summing these averages is still a way of estimating ⟨A + B⟩. However, we do not have enough information to calculate ∆(A + B). The reason is that ∆(A + B) is the uncertainty in a direct measurement of A + B. Since A and B do not commute, this requires a different experimental setup from a measurement of A and B alone.
If we wanted to use eq. (61) to calculate ∆(A + B), we would also have to estimate ⟨{A, B}⟩. The most straightforward way of doing this would be to measure the observable {A, B} = AB + BA, but this requires yet another different experimental setup, and one that is likely to be at least as complicated as measuring A + B directly.
Exceptions are cases where {A, B} = cI for some constant c, in which case ⟨{A, B}⟩ = c regardless of the state. In particular, this is true of the Pauli observables σ_x, σ_y, σ_z of a qubit, for which {σ_j, σ_k} = 2δ_{jk} I, where j and k run over x, y, z. Therefore, if we measure σ_x on many qubits prepared in the same way and σ_y on another set of such qubits, we can estimate ⟨σ_x + σ_y⟩ and ∆(σ_x + σ_y) without doing any further experiments using the formula

$$[\Delta(\sigma_x + \sigma_y)]^2 = (\Delta\sigma_x)^2 + (\Delta\sigma_y)^2 - 2\langle\sigma_x\rangle\langle\sigma_y\rangle.$$

When {A, B} ≠ cI, I do not know of any situations in which eq. (61) would be useful in practice, but from a theoretical point of view it is the appropriate generalization of eq. (60) to quantum mechanics, and this bolsters the case that Rcorr_{A,B} is the appropriate quantum generalization of the classical correlation.
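The Pauli procedure described above can be simulated. This is a hedged sketch rather than a description of any real experiment: the expectation values, shot count and seed are assumed, outcomes are drawn as ±1 with probability (1 ± ⟨σ⟩)/2, and the cross term uses ⟨{σ_x, σ_y}⟩ = 0 together with (∆σ)² = 1 − ⟨σ⟩² for a ±1 valued observable.

```python
import random

def sample_mean(expectation, shots, rng):
    """Average of simulated +/-1 outcomes with the given expectation value."""
    p_plus = (1 + expectation) / 2
    total = sum(1 if rng.random() < p_plus else -1 for _ in range(shots))
    return total / shots

def variance_of_sum(mean_x, mean_y):
    """[Delta(sx + sy)]^2 from the two means alone.

    Uses (Delta s)^2 = 1 - <s>^2 for a +/-1 valued observable and
    <{sx, sy}> = 0, so the cross term is -2 <sx><sy>.
    """
    return (1 - mean_x**2) + (1 - mean_y**2) - 2 * mean_x * mean_y

rng = random.Random(0)        # fixed seed so the run is reproducible
true_x, true_y = 0.6, 0.3     # hypothetical <sigma_x>, <sigma_y> of the state
shots = 20000                 # assumed number of qubits per batch

est_x = sample_mean(true_x, shots, rng)   # sigma_x measured on one batch
est_y = sample_mean(true_y, shots, rng)   # sigma_y measured on another batch

estimated = variance_of_sum(est_x, est_y)
exact = 2 - (true_x + true_y) ** 2        # because (sx + sy)^2 = 2I
```

No run ever measures σ_x + σ_y directly, yet the estimate converges to the exact variance as the number of shots grows. This works only because the anticommutator is a known multiple of the identity.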

Nonlinear Functions
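For nonlinear functions f(A, B) of two random variables A and B, it is common to use a first order Taylor expansion of f(A, B) about the point (⟨A⟩, ⟨B⟩) to derive an approximation for the variance [∆f(A, B)]² to second order in ∆A and ∆B. This yields the formula

$$[\Delta f(A,B)]^2 \approx \left(\left.\frac{\partial f}{\partial A}\right|_{A=\langle A\rangle,\,B=\langle B\rangle}\right)^2 (\Delta A)^2 + \left(\left.\frac{\partial f}{\partial B}\right|_{A=\langle A\rangle,\,B=\langle B\rangle}\right)^2 (\Delta B)^2 + 2\left.\frac{\partial f}{\partial A}\right|_{A=\langle A\rangle,\,B=\langle B\rangle}\left.\frac{\partial f}{\partial B}\right|_{A=\langle A\rangle,\,B=\langle B\rangle}\mathrm{corr}_{A,B}\,\Delta A\,\Delta B.$$

To avoid cluttering notation, I will write Ā for A = ⟨A⟩, and similarly B̄ for B = ⟨B⟩, so that we can more compactly write

$$[\Delta f(A,B)]^2 \approx \left(\left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\right)^2 (\Delta A)^2 + \left(\left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\right)^2 (\Delta B)^2 + 2\left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\mathrm{corr}_{A,B}\,\Delta A\,\Delta B. \tag{63}$$

When A and B are independent, this reduces to

$$[\Delta f(A,B)]^2 \approx \left(\left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\right)^2 (\Delta A)^2 + \left(\left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\right)^2 (\Delta B)^2,$$

which is the most commonly used form. The quantum generalization of eq. (63) is as follows.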
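The classical first order rule is easy to put into code. The sketch below is one possible implementation, assuming central-difference derivatives with a hypothetical step size `h`; the function name `propagate` and the numerical inputs are illustrative and not part of the original text.

```python
import math

def propagate(f, mean_a, mean_b, std_a, std_b, corr_ab=0.0, h=1e-6):
    """First-order estimate of the standard deviation of f(A, B)."""
    # Central-difference partial derivatives evaluated at the mean values.
    df_da = (f(mean_a + h, mean_b) - f(mean_a - h, mean_b)) / (2 * h)
    df_db = (f(mean_a, mean_b + h) - f(mean_a, mean_b - h)) / (2 * h)
    var = ((df_da * std_a) ** 2 + (df_db * std_b) ** 2
           + 2 * df_da * df_db * corr_ab * std_a * std_b)
    return math.sqrt(var)

# Hypothetical example: f = A * B with independent errors (corr = 0).
delta_product = propagate(lambda a, b: a * b, 10.0, 20.0, 0.1, 0.2)

# Perfectly correlated errors in a plain sum simply add.
delta_sum = propagate(lambda a, b: a + b, 1.0, 2.0, 0.3, 0.4, corr_ab=1.0)
```

For the product, the derivatives are 20 and 10, so the estimate is the square root of (20 × 0.1)² + (10 × 0.2)² = 8. The approximation is only as good as the linearization of f over a range of a few standard deviations.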
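$$[\Delta f(A,B)]^2 \approx \left(\left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\right)^2 (\Delta A)^2 + \left(\left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\right)^2 (\Delta B)^2 + 2\left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\mathrm{Rcorr}_{A,B}\,\Delta A\,\Delta B, \tag{64}$$

where ≈ means equality to second order in ∆A and ∆B.

Proof. Consider the first order Taylor expansion of f(A, B) about the point (⟨A⟩, ⟨B⟩), where f takes the value f_0 = f(⟨A⟩, ⟨B⟩),

$$f(A,B) \approx f_0 I + \left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\left(A - \langle A\rangle I\right) + \left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\left(B - \langle B\rangle I\right).$$

Applying proposition 6.1 to this gives

$$f(A,B)\left|\psi\right\rangle \approx f_0\left|\psi\right\rangle + \left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\Delta A\left|\psi^{\perp}_A\right\rangle + \left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\Delta B\left|\psi^{\perp}_B\right\rangle.$$

Taking the inner product of this with itself gives

$$\left\langle f(A,B)^2\right\rangle \approx f_0^2 + \left(\left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\right)^2 (\Delta A)^2 + \left(\left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\right)^2 (\Delta B)^2 + 2\left.\frac{\partial f}{\partial A}\right|_{\bar{A},\bar{B}}\left.\frac{\partial f}{\partial B}\right|_{\bar{A},\bar{B}}\mathrm{Re}\left\langle\psi^{\perp}_A\middle|\psi^{\perp}_B\right\rangle\Delta A\,\Delta B,$$

where Ā is shorthand for A = ⟨A⟩, and similarly for B̄.

As a formula for propagating uncertainty, eq. (64) inherits all of the problems of eq. (61), but the problems are compounded further by use of the first order Taylor approximation. This approximation is valid when ∆A and ∆B are suitably small compared to ⟨A⟩, ⟨B⟩, f(⟨A⟩, ⟨B⟩) and the derivatives of f(A, B) at A = ⟨A⟩, B = ⟨B⟩. This is often the case in classical experiments where everything can be measured with a small statistical error. However, in quantum mechanics, when A and B do not commute, the (various) uncertainty relations tell us that there is necessarily a trade-off between the size of ∆A and ∆B. If one of them is small, then the other might necessarily have to be large. For example, for the Pauli observables σ_x and σ_y, at least one of the uncertainties must be comparable in size to 1, which is the largest possible value of σ_x or σ_y.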
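How badly the first order approximation can fail when an uncertainty is comparable to 1 can be seen in a deliberately simple case. The example is my own illustration, not one from the text, and it uses a function of a single observable: f(σ_x) = σ_x². Since σ_x² = I, a measurement of f has no spread at all, while the first order rule |f′(⟨σ_x⟩)| ∆σ_x predicts a sizeable one. The value of ⟨σ_x⟩ below is an assumed input.

```python
import math

def first_order_std_of_square(mean, std):
    """First-order propagation for f(A) = A^2: |f'(<A>)| * Delta A."""
    return abs(2 * mean) * std

# Qubit state with <sigma_x> = r, so that Delta sigma_x = sqrt(1 - r^2).
r = 0.6                                  # hypothetical expectation value
delta_sx = math.sqrt(1 - r**2)           # 0.8, comparable in size to 1

approx = first_order_std_of_square(r, delta_sx)   # first-order prediction
exact = 0.0    # sigma_x^2 = I, so every measurement of it returns 1
```

Here the first order rule predicts an uncertainty of about 0.96 for a quantity that has none. The discrepancy only shrinks as ∆σ_x becomes small, which happens when the state approaches an eigenstate of σ_x.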
A case where the formula will work well is a continuous variable system where ∆x ∼ ∆p ∼ √ħ, and ⟨x⟩, ⟨p⟩ are large compared to √ħ. But this is a case where you would expect classical physics to be a good approximation anyway.
I do not know whether there is a practical use of eq. (64), but it is nonetheless a correct formal generalization of eq. (63).

Dealing with Mixed States
So far, we have dealt exclusively with the case of pure state vectors |ψ⟩. However, all of our results can be extended to more general density operators ρ, which can represent mixed states. The most familiar way to do this is to make use of the concept of a purification of a density operator. Given a density operator ρ_S on a Hilbert space H_S, where S stands for "system", we can always find a pure state vector |ψ⟩_SE ∈ H_S ⊗ H_E, where E is the "environment", such that ρ_S = Tr_E(|ψ⟩⟨ψ|_SE), where Tr_E is the partial trace over H_E. You can then apply the Aharonov-Vaidman identity to operators of the form A_S ⊗ I_E acting on a purification to obtain results about the density operator ρ_S. However, to make the parallels to the pure state case as close as possible, I prefer to use an equivalent concept, called an amplitude operator. The equivalence between amplitude operators and purifications is discussed in appendix A.

Definition 8.1. Given a density operator ρ_S on a Hilbert space H_S, an amplitude operator for ρ_S is a linear operator L_S : H_E → H_S, where H_E is any Hilbert space, such that

$$\rho_S = L_S L_S^{\dagger}.$$

The reason for the name amplitude operator is that, in pure-state quantum mechanics, an amplitude is a complex number α such that |α|² is a probability. A density operator is a noncommutative generalization of a probability distribution [42,43], and hence an amplitude operator ought to be an operator that "squares" to a density operator.
Given a density operator ρ_S, one obvious way of constructing an amplitude operator is to set H_E = H_S and L_S = √ρ_S, but there are an infinite number of alternatives, as proposition 8.2 shows.

Going back to the analogy between amplitudes and amplitude operators, multiplying an amplitude α by a phase factor e^{iφ} does not change the probability it represents. Similarly, multiplying an amplitude operator L_S on the right by a semi-unitary V_{E|E′}, i.e. an operator V_{E|E′} : H_{E′} → H_E satisfying V_{E|E′} V_{E|E′}^† = I_E, does not change the density operator it represents. Although one might think it desirable to work directly with probabilities or density operators in order to eliminate these ambiguities, the mathematical manipulations we need to do in quantum mechanics are often linear in the amplitudes or amplitude operators, but would be nonlinear if you used probabilities or density operators. Therefore, it is often more convenient to live with the ambiguity.
Since every operator has a polar decomposition, the only requirement for L_S to be an amplitude operator for some density operator is that Tr_S(L_S L_S^†) = 1. If we want to work with unnormalized density operators, i.e. any positive operator, then any operator L_S : H_E → H_S is the amplitude operator for some (possibly unnormalized) density operator. This is analogous to the fact that any vector in H_S represents a (possibly unnormalized) pure state.
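The freedom in choosing an amplitude operator can be made concrete with small matrices. The following sketch assumes a diagonal qubit density operator (so that its square root is taken entry by entry) and a hypothetical semi-unitary V mapping a three-dimensional environment space into a two-dimensional one; all names and numbers are illustrative.

```python
import math

def matmul(a, b):
    """Product of rectangular matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Assumed diagonal density operator, so sqrt(rho) is diagonal too.
rho = [[0.7, 0.0], [0.0, 0.3]]
sqrt_rho = [[math.sqrt(0.7), 0.0], [0.0, math.sqrt(0.3)]]

# Semi-unitary V with orthonormal rows, so that V V^dagger = I (2 x 2).
s = 1 / math.sqrt(2)
V = [[s, s * 1j, 0.0],
     [0.0, 0.0, 1.0]]

L1 = sqrt_rho               # the "obvious" amplitude operator
L2 = matmul(sqrt_rho, V)    # a different amplitude operator for the same rho

rho1 = matmul(L1, dagger(L1))
rho2 = matmul(L2, dagger(L2))
trace2 = sum(rho2[i][i] for i in range(2)).real   # normalization check
```

Both L1 and L2 reproduce the same density operator, mirroring the way a global phase leaves a probability unchanged. Note that V†V is not the identity on the larger space, which is why V is only semi-unitary.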
The strategy for generalizing the Aharonov-Vaidman identity, and everything that follows from it, is to replace the state vector |ψ⟩_S with an amplitude operator L_S. The reason this works is that the space of linear operators mapping H_E to H_S, which we denote L_{S|E}, is itself a Hilbert space with inner product ⟨L_S, M_S⟩ = Tr_E(L_S^† M_S), known as the Hilbert-Schmidt

Summary and Conclusions
In this paper, I discussed how the standard textbook uncertainty relations of Robertson and Schrödinger can be derived from the Aharonov-Vaidman identity in a more direct way than the standard proof. I also demonstrated the identity's usefulness in proving other uncertainty relations, such as the Maccone-Pati relations, and the quantum formulas for propagation of uncertainty. Finally, I gave a mixed-state generalization of the Aharonov-Vaidman identity in terms of amplitude operators. I hope that this has persuaded you that the Aharonov-Vaidman identity belongs in undergraduate textbooks and that it ought to be a first-line tool in proving relationships between standard deviations in quantum mechanics.

I am sure there are other uncertainty relations that have elegant Aharonov-Vaidman based proofs, and I hope to find new and useful uncertainty relations that have not been discovered before via this method. The Aharonov-Vaidman identity naturally gives rise to two quantum generalizations of the correlation, corr_{A,B} and Rcorr_{A,B}. It would be interesting to determine whether these quantities have an operational meaning in the case where A and B do not commute. On the more formal side, perhaps there is a pseudo-probability representation of quantum mechanics, such as the Wigner function [45,46,47] or the Kirkwood-Dirac distribution [48,49,50], for which these are the correlations for observables as defined on the appropriate phase space. This might help to find uses for the propagation of error formulas in cases where the observables do not commute.

|h⊥_f⟩ and |h⊥_g⟩ are (generally different) vectors orthogonal to |h⟩ and α_1, β_1, α_2, β_2 are complex coefficients. This is what we do in the proof of the Aharonov-Vaidman identity with the choices |f⟩ = A|ψ⟩, |g⟩ = B|ψ⟩ and |h⟩ = |ψ⟩.

Figure 1: Diagram showing that there exists a unit vector |ψ⊥⟩ such that |ψ⟩ and |ψ⊥⟩ form an orthogonal basis for the two-dimensional subspace of H spanned by |ψ⟩ and |φ⟩.

½|⟨ψ|[A, B]|ψ⟩| is state dependent, but if we choose A = x, B = p, then |⟨ψ|[A, B]|ψ⟩| = 1 and so we get the Heisenberg relation ∆x∆p ≥ ½.

Remark 7.2. For operators A_1, A_2, ···, A_n and real numbers α_1, α_2, ···, α_n, theorem 7.1 is easily generalized to

$$\left[\Delta\left(\sum_{j=1}^{n}\alpha_j A_j\right)\right]^2 = \sum_{j=1}^{n}\alpha_j^2(\Delta A_j)^2 + 2\sum_{j<k}\alpha_j\alpha_k\,\mathrm{Rcorr}_{A_j,A_k}\,\Delta A_j\,\Delta A_k.$$

Proposition 8.2. An operator L_S : H_E → H_S is an amplitude operator for ρ_S if and only if L_S = √ρ_S U_{S|E}, where U_{S|E} : H_E → H_S is a semi-unitary operator, i.e. it satisfies U_{S|E} U_{S|E}^† = I_S.

Proof. An operator of the form L_S = √ρ_S U_{S|E} obviously satisfies definition 8.1. For the other direction, assume L_S is an amplitude operator. Like any operator, it may be written in its polar decomposition L_S = P_S U_{S|E}, where P_S is a positive semi-definite operator on H_S and U_{S|E} : H_E → H_S is semi-unitary⁴. The definition of an amplitude operator then implies that ρ_S = P_S U_{S|E} U_{S|E}^† P_S = P_S², so we must have P_S = √ρ_S.