Derivative of the disturbance with respect to information from quantum measurements

To study the trade-off between information and disturbance, we obtain the first and second derivatives of the disturbance with respect to information for a fundamental class of quantum measurements. We focus on measurements lying on the boundaries of the physically allowed regions in four information-disturbance planes, using the derivatives to investigate the slopes and curvatures of these boundaries and hence clarify the shapes of the allowed regions.


Introduction
In quantum theory, any measurement that provides information about a physical system also inevitably disturbs the system's state in a way that depends on the measurement's outcome. This trade-off between information and disturbance is of great interest in establishing the foundations of quantum mechanics and plays an important role in quantum information processing and communication [1] techniques, such as quantum cryptography [2][3][4][5]. Many authors [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23] have therefore discussed this trade-off, using several different formulations. For example, Banaszek [7] found an inequality between the amount of information gained and the size of the state change, whereas Cheong and Lee [20] found one between the amount of information gained and the reversibility of the state change. These inequalities have both been verified [24][25][26][27] in single-photon experiments.
Recently, we have also studied this trade-off, deriving the allowed regions in four types of information-disturbance planes [28]. These four information-disturbance pairs combine one information measure, namely the Shannon entropy [6] or estimation fidelity [7], with one disturbance measure, namely the operation fidelity [7] or physical reversibility [29]. The boundaries of the allowed regions give upper and lower bounds on the information for a given disturbance, together with the optimal measurements that saturate the upper bounds. The optimal measurements are different for each of the four pairs, because the allowed regions' upper boundaries have different curvatures on each of the information-disturbance planes [28].
Contrary to expectations, the allowed regions show that measurements providing more information do not necessarily cause larger disturbances. This is because the allowed regions have finite areas, i.e., for any given measurement corresponding to an interior point of an allowed region, there always exists another measurement that provides more information with smaller disturbance near that point. However, measurements that lie on the boundary of an allowed region in the information-disturbance plane are subject to a trade-off: modifying them to increase the information obtained by moving along the boundary also increases the disturbance, at a rate given by the boundary's slope.
In this paper, we obtain the first and second derivatives of the disturbance with respect to the information obtained from measurements lying on the allowed regions' boundaries for each of the four information-disturbance pairs. These measurements are described by a diagonal operator with a continuous parameter and are applied to a d-level system in a completely unknown state. For such measurements, we calculate these derivatives to demonstrate the slopes and curvatures of the allowed regions' boundaries, clarifying the regions' shapes and hence broadening our perspective on the trade-off between information and disturbance in quantum measurements.
Here, we focus on the information and disturbance pertaining to a single outcome, not averaged over all possible outcomes [6,7,9,10,16,18,20], to study the trade-off at the single-outcome level [11,30,31,32,33]. This trade-off is strongly implied by the existence of physically reversible measurements [34][35][36][37][38][39][40][41][42][43][44]. In fact, with physically reversible measurements, a second measurement that recovers the system's state prior to the first measurement always erases all information obtained by the first measurement [35,38]. This state recovery with information erasure occurs not on average but only when the second measurement yields a single preferred outcome, which implies that there is a trade-off at the single-outcome level.
The rest of this paper is organized as follows. Section 2 reviews the procedure for quantifying the information and the disturbance in quantum measurements, giving their explicit forms for a fundamental class of measurements as functions of a certain parameter. Section 3 presents the first and second derivatives of the information and the disturbance for such measurements with respect to this parameter, while Section 4 gives the first and second derivatives of the disturbance with respect to the information. Finally, Section 5 summarizes our results.

Information and Disturbance
First, we quantify the amount of information provided by a given quantum measurement [28]. Suppose we want to measure a d-level system that is known to be in one of a predefined set of pure states $\{|\psi(a)\rangle\}$. The probability of the system being in the state $|\psi(a)\rangle$ is given by $p(a)$, but we do not know the actual state of the system. To study the case where no prior information about the system is available, we assume that the set $\{|\psi(a)\rangle\}$ consists of all possible pure states and that $p(a)$ is uniform according to a normalized invariant measure over the pure states.
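Next, we consider how a measurement provides information about the system's state. An ideal quantum measurement [45] can be described by a set of measurement operators $\{\hat{M}_m\}$ that satisfy

$$\sum_m \hat{M}_m^\dagger \hat{M}_m = \hat{I}, \tag{1}$$

where $m$ denotes the outcome of the measurement and $\hat{I}$ is the identity operator. When the system is in state $|\psi(a)\rangle$, a measurement $\{\hat{M}_m\}$ yields the outcome $m$ with probability

$$p(m|a) = \langle\psi(a)|\hat{M}_m^\dagger \hat{M}_m|\psi(a)\rangle \tag{2}$$

and changes the state to

$$|\psi(m,a)\rangle = \frac{1}{\sqrt{p(m|a)}}\,\hat{M}_m|\psi(a)\rangle. \tag{3}$$

The measurement outcome provides some information about the system's state. For example, given the outcome $m$, the probability that the initial state was $|\psi(a)\rangle$ is given by

$$p(a|m) = \frac{p(m|a)\,p(a)}{p(m)} \tag{4}$$

using Bayes's rule, where

$$p(m) = \sum_a p(m|a)\,p(a) \tag{5}$$

is the total probability of the outcome $m$. This therefore changes the state probability distribution from $\{p(a)\}$ to $\{p(a|m)\}$, decreasing the Shannon entropy by

$$I(m) = \Bigl[-\sum_a p(a)\log_2 p(a)\Bigr] - \Bigl[-\sum_a p(a|m)\log_2 p(a|m)\Bigr]. \tag{6}$$

This entropy change, $I(m)$, quantifies the amount of information provided by a measurement $\{\hat{M}_m\}$ with outcome $m$ [11,46], and satisfies

$$0 \le I(m) \le \log_2 d - \frac{1}{\ln 2}\bigl[\eta(d) - 1\bigr], \tag{7}$$

where

$$\eta(n) = \sum_{i=1}^{n} \frac{1}{i}. \tag{8}$$

Note that $I(m)$ relates to a single outcome, unlike the average

$$I = \sum_m p(m)\,I(m), \tag{9}$$

which was discussed in Ref. [6].

The measurement outcome $m$ can also be used to estimate the system's state as $|\varphi(m)\rangle$, where an optimal $|\varphi(m)\rangle$ is the eigenvector of $\hat{M}_m^\dagger \hat{M}_m$ corresponding to its maximum eigenvalue [7]. The quality of this estimate can be evaluated in terms of the estimation fidelity $G(m)$:

$$G(m) = \sum_a p(a|m)\,\bigl|\langle\varphi(m)|\psi(a)\rangle\bigr|^2. \tag{10}$$

This also quantifies the amount of information provided by the outcome $m$, and satisfies

$$\frac{1}{d} \le G(m) \le \frac{2}{d+1}. \tag{11}$$

Again, note that $G(m)$ relates to a single outcome, unlike

$$G = \sum_m p(m)\,G(m), \tag{12}$$

which was discussed in Ref. [7].

Next, we quantify the degree of disturbance caused by the measurement $\{\hat{M}_m\}$ [28]. The outcome $m$ changes the system's state from $|\psi(a)\rangle$ to $|\psi(m,a)\rangle$, given by Eq. (3). The size of this change can be evaluated using the operation fidelity $F(m)$:

$$F(m) = \sum_a p(a|m)\,\bigl|\langle\psi(a)|\psi(m,a)\rangle\bigr|^2. \tag{13}$$

This quantifies the degree of disturbance caused when a measurement $\{\hat{M}_m\}$ yields the outcome $m$, and satisfies

$$\frac{2}{d+1} \le F(m) \le 1. \tag{14}$$

Again, note that $F(m)$ relates to a single outcome, unlike

$$F = \sum_m p(m)\,F(m), \tag{15}$$

which was discussed in Ref. [7].

In addition to the size of the state change, the reversibility of the change can also be used to quantify the disturbance. Even though $|\psi(a)\rangle$ and $|\psi(m,a)\rangle$ are unknown, the change can be physically reversed by a reversing measurement on $|\psi(m,a)\rangle$ if $\hat{M}_m$ has a bounded left inverse $\hat{M}_m^{-1}$ [35,36]. Such a reversing measurement can be described by another set of measurement operators $\{\hat{R}^{(m)}_\mu\}$ that satisfy

$$\sum_\mu \hat{R}^{(m)\dagger}_\mu \hat{R}^{(m)}_\mu = \hat{I} \tag{16}$$

and $\hat{R}^{(m)}_{\mu_0} \propto \hat{M}_m^{-1}$ for a particular $\mu = \mu_0$, where $\mu$ denotes the reversing measurement's outcome. When this measurement on $|\psi(m,a)\rangle$ yields the preferred outcome $\mu_0$, the system's state returns to $|\psi(a)\rangle$ because $\hat{R}^{(m)}_{\mu_0}\hat{M}_m \propto \hat{I}$. The state recovery probability for an optimal reversing measurement [29] is

$$R(m,a) = \frac{\inf_{|\psi\rangle}\langle\psi|\hat{M}_m^\dagger \hat{M}_m|\psi\rangle}{p(m|a)}, \tag{17}$$

and we can use this to evaluate the reversibility of the state change as

$$R(m) = \sum_a p(a|m)\,R(m,a). \tag{18}$$

This also quantifies the degree of disturbance caused when a measurement $\{\hat{M}_m\}$ yields the outcome $m$, and satisfies

$$0 \le R(m) \le 1. \tag{19}$$

Again, note that $R(m)$ relates to a single outcome, unlike

$$R = \sum_m p(m)\,R(m), \tag{20}$$

which was discussed in Refs. [20,29].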
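As an important example, we consider a diagonal measurement operator $\hat{M}^{(d)}_{k,l}(\lambda)$ with $0 \le \lambda \le 1$, which can be written in an orthonormal basis $\{|i\rangle\}$ as

$$\hat{M}^{(d)}_{k,l}(\lambda) = \sum_{i=1}^{k} |i\rangle\langle i| + \sum_{i=k+1}^{k+l} \lambda\,|i\rangle\langle i|,$$

where $k \ge 1$, $l \ge 1$, and $k + l \le d$. The information yielded and the disturbance caused by this operator can be quantified in terms of $I(m)$, $G(m)$, $F(m)$, and $R(m)$, given by Eqs. (6), (10), (13), and (18), as functions of the parameter $\lambda$. Using the general formula derived in Ref. [33], $I(m)$ can be calculated as in Eq. (23), where $J$ is given by Eq. (24) with coefficients $a^{(j)}_n$ defined in Eq. (25) for $n = 0, 1, \ldots, j$. Likewise, $G(m)$, $F(m)$, and $R(m)$ can be calculated to be [33]

$$G(m) = \frac{1}{d+1}\left(\frac{k + l\lambda^2 + 1}{k + l\lambda^2}\right), \tag{27}$$

$$F(m) = \frac{1}{d+1}\left[\frac{(k + l\lambda)^2}{k + l\lambda^2} + 1\right], \tag{28}$$

$$R(m) = d\left(\frac{\lambda^2}{k + l\lambda^2}\right)\delta_{d,k+l}. \tag{29}$$

The measurement operator $\hat{M}^{(d)}_{k,l}(\lambda)$ is very important for obtaining the allowed regions in the information-disturbance planes, which are found by plotting all physically possible measurement operators. We consider four different allowed regions, based on using $I(m)$ or $G(m)$ to quantify the information and $F(m)$ or $R(m)$ to quantify the disturbance. Figure 1 shows these four allowed regions for $d = 4$ in gray [28], where the lines $(k, l)$ correspond to $\hat{M}^{(d)}_{k,l}(\lambda)$ with $0 \le \lambda \le 1$ and the points $P_r$ correspond to the projective measurement operator of rank $r$:

$$\hat{P}^{(d)}_r = \sum_{i=1}^{r} |i\rangle\langle i|.$$

Clearly, $\hat{M}^{(d)}_{k,l}(0) = \hat{P}^{(d)}_k$ and $\hat{M}^{(d)}_{k,l}(1) = \hat{P}^{(d)}_{k+l}$. Thus, the line $(k, l)$ connects $P_k$ to $P_{k+l}$ and the point $P_d$ is at the top left corner of the plot.

The boundaries of the allowed regions consist of the lines $(1, d-1)$ and $(k, 1)$ for $k = 1, 2, \ldots, d-1$ [28]. Therefore, to find the slopes and curvatures of the boundaries, we need to calculate the first and second derivatives of the disturbance with respect to information for $\hat{M}^{(d)}_{k,l}(\lambda)$.

The above allowed regions were obtained by considering ideal measurements, as in Eq. (3), with optimal estimates for $G(m)$. Unfortunately, the lower boundaries can be violated by non-ideal measurements, which yield mixed post-measurement states due to classical noise, or by non-optimal estimates, which make suboptimal choices for $|\varphi(m)\rangle$. Here, we ignore such non-quantum effects in order to focus on the quantum nature of measurement.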
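The single-outcome definitions of $G(m)$, $F(m)$, and $R(m)$ can be checked numerically for the diagonal operator. The following is a minimal sketch, not the procedure used in Refs. [28,33]: it assumes that the operator has $k$ unit entries, $l$ entries equal to $\lambda$, and $d-k-l$ zeros on its diagonal, samples uniformly distributed pure states as normalized complex Gaussian vectors, and weights each sample by $p(a|m) \propto p(m|a)$. The function names and the closed forms in `closed_forms` are illustrative assumptions that follow from these definitions.

```python
import numpy as np


def diagonal_operator(d, k, l, lam):
    """Assumed form of the diagonal operator: k ones, l entries lam, rest zeros."""
    diag = np.zeros(d)
    diag[:k] = 1.0
    diag[k:k + l] = lam
    return np.diag(diag)


def sample_pure_states(d, n, rng):
    """Draw n uniformly distributed pure states (normalized complex Gaussians)."""
    z = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def single_outcome_measures(M, n_samples=100_000, seed=0):
    """Monte Carlo estimates of G(m), F(m), R(m) for one measurement operator M."""
    rng = np.random.default_rng(seed)
    psi = sample_pure_states(M.shape[0], n_samples, rng)
    m_psi = psi @ M.T                               # rows hold M|psi(a)>
    p = np.sum(np.abs(m_psi) ** 2, axis=1)          # p(m|a)
    w = p / p.sum()                                 # p(a|m) for a uniform prior
    evals, evecs = np.linalg.eigh(M.conj().T @ M)   # eigenvalues in ascending order
    phi = evecs[:, -1]                              # optimal estimate |phi(m)>
    G = np.sum(w * np.abs(psi.conj() @ phi) ** 2)
    overlap = np.sum(psi.conj() * m_psi, axis=1)    # <psi(a)|M|psi(a)>
    F = np.sum(w * np.abs(overlap) ** 2 / p)        # |<psi(a)|psi(m,a)>|^2, averaged
    R = np.sum(w * evals[0] / p)                    # inf<psi|M^dag M|psi> / p(m|a)
    return G, F, R


def closed_forms(d, k, l, lam):
    """Closed forms implied by the assumed diagonal operator."""
    t = k + l * lam ** 2
    G = (1 + 1 / t) / (d + 1)
    F = (1 + (k + l * lam) ** 2 / t) / (d + 1)
    R = d * lam ** 2 / t if k + l == d else 0.0
    return G, F, R
```

For example, `single_outcome_measures(diagonal_operator(4, 1, 3, 0.5))` can be compared with `closed_forms(4, 1, 3, 0.5)`; the two should agree up to sampling error, which shrinks as `n_samples` grows.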

Derivatives with respect to λ²
To calculate the derivative of the disturbance with respect to information for $\hat{M}^{(d)}_{k,l}(\lambda)$, we first consider the derivatives of the information and disturbance with respect to the parameter $\lambda^2$. For simplicity, we focus on derivatives with respect to $\lambda^2$ rather than $\lambda$ itself. These derivatives are straightforward to calculate because the information and the disturbance are expressed as functions of $\lambda = \sqrt{\lambda^2}$ in Eqs. (23), (27), (28), and (29). However, the expression for the derivative of $I(m)$ is quite long because of the expression for $J$ given in Eq. (24). The first and second derivatives of $I(m)$ follow from Eq. (23) in terms of $J'$ and $J''$, where primes represent derivatives with respect to $\lambda^2$. The first derivative of $J$ and the second derivative of $J$ [Eq. (35)] involve, in addition to the coefficients of Eq. (25), the coefficient $a^{(j)}_{j+1}$. In the limit as $\lambda \to 0$, $J'$ and $J''$ are given by Eq. (36), as shown in Appendix A. Similarly, at $\lambda = 1$, $J$ and its derivatives are given by Eq. (41), as shown in Appendix B. Likewise, the first and second derivatives of $G(m)$, $F(m)$, and $R(m)$ follow from Eqs. (27), (28), and (29).

Derivatives with respect to information
Using the derivatives of the information and disturbance with respect to $\lambda^2$, we can now calculate the derivatives of the disturbance with respect to information for $\hat{M}^{(d)}_{k,l}(\lambda)$. Let $f$ and $g$ be arbitrary functions of $\lambda$. Given the derivatives of $f$ and $g$ with respect to $\lambda^2$, the first and second derivatives of $f$ with respect to $g$ are

$$\frac{df}{dg} = \frac{f'}{g'}, \qquad \frac{d^2 f}{dg^2} = \frac{f''g' - f'g''}{(g')^3}.$$

The same results can be obtained using derivatives with respect to $\lambda$.
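The chain-rule construction above can be illustrated numerically for $F(m)$ versus $G(m)$. This is a sketch under stated assumptions rather than the paper's own calculation: it assumes the closed forms $G(m) = [1 + 1/(k + l\lambda^2)]/(d+1)$ and $F(m) = [1 + (k + l\lambda)^2/(k + l\lambda^2)]/(d+1)$, replaces the analytic derivatives with central finite differences in $x = \lambda^2$, and uses hypothetical function names. Under these assumptions the slope reduces analytically to $dF/dG = -k(k + l\lambda)(1-\lambda)/\lambda$, which the numerical result can be compared against.

```python
import numpy as np


def G_of_x(x, d, k, l):
    """Assumed closed form of G(m) as a function of x = lambda^2."""
    return (1 + 1 / (k + l * x)) / (d + 1)


def F_of_x(x, d, k, l):
    """Assumed closed form of F(m) as a function of x = lambda^2."""
    lam = np.sqrt(x)
    return (1 + (k + l * lam) ** 2 / (k + l * x)) / (d + 1)


def central_derivatives(func, x, h=1e-4):
    """First and second derivatives of func at x by central differences."""
    first = (func(x + h) - func(x - h)) / (2 * h)
    second = (func(x + h) - 2 * func(x) + func(x - h)) / h ** 2
    return first, second


def slope_and_curvature(f, g, x, h=1e-4):
    """Return df/dg and d^2f/dg^2 from derivatives with respect to x."""
    f1, f2 = central_derivatives(f, x, h)
    g1, g2 = central_derivatives(g, x, h)
    return f1 / g1, (f2 * g1 - f1 * g2) / g1 ** 3


# Upper boundary (1, d - 1) for d = 4, evaluated at lambda = 0.5.
d, k, l, lam = 4, 1, 3, 0.5
slope, curvature = slope_and_curvature(
    lambda x: F_of_x(x, d, k, l), lambda x: G_of_x(x, d, k, l), lam ** 2
)
analytic_slope = -k * (k + l * lam) * (1 - lam) / lam  # -2.5 under the assumptions
```

Both `slope` and `curvature` come out negative at this point, which is consistent with the lines in the $F(m)$-$G(m)$ plane having negative first and second derivatives for $0 < \lambda < 1$.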

Figures 4(a) and 5(a) show these derivatives as functions of $G(m)$ [Eq. (27)] for $d = 4$, for various $(k, l)$. Because $\lambda = 0$ corresponds to $P_k$ and $\lambda = 1$ corresponds to $P_{k+l}$ for the lines $(k, l)$ in Fig. 1, the values of the derivatives at these points follow from the limits $\lambda \to 0$ and $\lambda \to 1$. Where the expressions are indeterminate at $P_{k+l}$, the limits can be found by applying L'Hôpital's rule and considering higher derivatives, as shown in Appendix C. The first derivative of $F(m)$ with respect to $I(m)$ [Fig. 4(c)] is negative, and the second derivative [Fig. 5(c)] is always negative if $k \ge l$ but can be positive near $P_{k+l}$ if $k < l$. This means that the lines $(k, l)$ in Fig. 1(c) are monotonically decreasing convex curves if $k \ge l$ but monotonically decreasing S-shaped curves if $k < l$. In particular, even though it is difficult to see from Fig. 1(c), the upper boundary $(1, d-1)$ has a slight dent near $P_d$ when $d \ge 3$ [28]. Finally, the first and second derivatives of $R(m)$ with respect to $I(m)$ follow from Eqs. (31), (34), (46), and (50) and can be evaluated explicitly when $k + l = d$.

Conclusion
In this paper, we have obtained the first and second derivatives of the disturbance with respect to information for a class of quantum measurements described by the measurement operator $\hat{M}^{(d)}_{k,l}(\lambda)$. In each of the four information-disturbance planes, $\hat{M}^{(d)}_{k,l}(\lambda)$ with $0 \le \lambda \le 1$ corresponds to a line $(k, l)$, as shown in Fig. 1. In particular, the lines $(1, d-1)$ and $(k, 1)$ form the boundaries of the allowed regions obtained by plotting all physically possible measurement operators in these planes [28].
The slope and curvature of each line $(k, l)$ are given by the first and second derivatives of the disturbance with respect to the information for $\hat{M}^{(d)}_{k,l}(\lambda)$. The first derivatives are shown for $d = 4$ in Fig. 4, while the second derivatives are given by Eqs. (53), (57), (59), and (66) (shown for $d = 4$ in Fig. 5). For $F(m)$ versus $G(m)$, all the lines $(k, l)$ in Fig. 1(a) are monotonically decreasing convex curves, because the first and second derivatives are non-positive and negative, respectively, as shown in Figs. 4(a) and 5(a). For $R(m)$ versus $G(m)$, all the lines $(k, l)$ in Fig. 1(b) are monotonically decreasing straight lines, because the first and second derivatives are negative and zero, respectively, as shown in Figs. 4(b) and 5(b). For $F(m)$ versus $I(m)$, the lines $(k, l)$ in Fig. 1(c) are monotonically decreasing convex curves if $k \ge l$ and monotonically decreasing S-shaped curves if $k < l$, because the first derivative is negative and the second derivative is always negative if $k \ge l$ but can be positive near $P_{k+l}$ if $k < l$, as shown in Figs. 4(c) and 5(c). Finally, for $R(m)$ versus $I(m)$, all the lines $(k, l)$ in Fig. 1(d) are monotonically decreasing concave curves, because the first and second derivatives are negative and positive, respectively, as shown in Figs. 4(d) and 5(d).
Based on these results, we can see that the boundaries $(1, d-1)$ and $(k, 1)$ of the allowed regions have non-positive slopes for all four information-disturbance pairs, indicating that there is a trade-off between the information and the disturbance for measurements on their boundaries. When the information is increased by moving along a boundary, the disturbance also increases, decreasing $F(m)$ and $R(m)$. In addition, the rate of change of the disturbance with respect to information is given by the boundary's slope. For example, if $G(m)$ is increased by $\Delta G(m)$, $F(m)$ decreases by about $|dF(m)/dG(m)|\,\Delta G(m)$. Figure 4(a) shows that $|dF(m)/dG(m)|$ is infinitely large near $P_1$, but almost zero near $P_d$.
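As a worked illustration of this linear estimate, consider the upper boundary $(1, d-1)$ for $d = 4$ at $\lambda = 1/2$. The slope formula used below is not quoted from the text: it is an assumption obtained by applying the chain rule to the closed forms $G(m) = [1 + 1/(k + l\lambda^2)]/(d+1)$ and $F(m) = [1 + (k + l\lambda)^2/(k + l\lambda^2)]/(d+1)$.

```latex
\frac{dF(m)}{dG(m)} = -\frac{k\,(k + l\lambda)(1 - \lambda)}{\lambda}
\;\;\Longrightarrow\;\;
\left.\frac{dF(m)}{dG(m)}\right|_{k=1,\; l=3,\; \lambda=1/2} = -\frac{5}{2},
\qquad
\Delta F(m) \approx \frac{dF(m)}{dG(m)}\,\Delta G(m) = -0.025
\quad \text{for} \quad \Delta G(m) = 0.01 .
```

Under the same assumption, the slope diverges as $\lambda \to 0$ (near $P_1$) and vanishes at $\lambda = 1$ (at $P_d$), in line with the behavior of $|dF(m)/dG(m)|$ described above.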
In contrast, the curvatures of the boundaries (1, d − 1) and (k, 1) for the four information-disturbance pairs have different signs. This means that the allowed regions are extended in different ways when the information and disturbance are averaged over all possible outcomes, as with I, G, F , and R, given by Eqs. (9), (12), (15), and (20), because the allowed regions for the average values are the convex hulls of those for a single outcome [28]. The upper boundaries of the allowed regions for the average values correspond to the optimal measurements that saturate the upper information bounds for a given disturbance. Consequently, the optimal measurements are different for each of the four information-disturbance pairs [28].

Appendices
A Limits as λ → 0
Here, we show that the first and second derivatives of $J$ with respect to $\lambda^2$ are as given in Eq. (36) in the limit as $\lambda \to 0$. First, note that the limit of $J$ itself was obtained in Appendix C of Ref. [33]. The limits of these derivatives can be shown in a similar way.
In the first derivative of $J$, only the terms with $n = j$ survive at $\lambda = 0$ because $c^{(j)}_n(0) = 0$ if $n < j$. The resulting equation can be simplified by using an identity that can be derived from Eq. (75) by expanding every factor as a Taylor series. In other words, the first factor in Eq. (75) can be expanded using the generalized binomial theorem while the other factors can be expanded in terms of coefficients $\{a^{(j)}_n\}$ [33], where the $a^{(j)}_n$'s are given by Eq. (25) for $n = 0, 1, \ldots, j$ and are extended to $n > j$. With this identity, $J'$ at $\lambda = 0$ takes the form given in Eq. (36). Similarly, the second derivative of $J$ [Eq. (35)] can be shown to take the form given in Eq. (36) if $k \ge 2$ by using an identity that can be derived from the terms of order $\epsilon^{k-1}$ in the corresponding series expansion. However, if $k = 1$, $J''$ contains $c^{(l+1)}_{l+1}(\lambda)$, which diverges in the limit as $\lambda \to 0$, as shown by Eq. (38). By combining these results, we find that $J''$ is given by Eq. (36) at $\lambda = 0$.

B Limits as λ → 1
Here, we show that the first and second derivatives of $J$ with respect to $\lambda^2$ are as given in Eq. (41) in the limit as $\lambda \to 1$. To find the derivatives at $\lambda = 1$, we first obtain the Taylor series for $J$ around $\lambda = 1$ by substituting $\lambda^2 = 1 - \epsilon$ into Eq. (24), which gives an expansion in powers of $\epsilon$ with coefficients $\{j_n\}$. Note that the terms with negative powers of $\epsilon$ cancel each other out in this expansion because $J$ is finite, even at $\lambda = 1$ [33]. The coefficients $\{j_n\}$ are related to the derivatives of $J$ at $\lambda = 1$ by Eq. (84). In Appendix C of Ref. [33], $j_0$ was shown to be $a^{(k+l)}_{k+l-1}$, as given in Eq. (41), and the other coefficients can be handled similarly.
For example, $j_1$ can be obtained by applying Eq. (77) to $c^{(j)}_n\bigl(\sqrt{1-\epsilon}\bigr)$. The expression in the square brackets of the result can be rewritten using Eq. (25). By then using an identity that can be derived from the terms of order $\epsilon^{l-1}$ in the corresponding series expansion, we find $j_1$. Therefore, from Eq. (84), we see that $J'$ is given by Eq. (41) at $\lambda = 1$. Similarly, $j_2$ can be found by using an identity that can be derived from the terms of order $\epsilon^{l-1}$ in the corresponding series expansion. Therefore, from Eq. (84), we see that $J''$ is given by Eq. (41) at $\lambda = 1$.
In general, we can use a similar argument to find $j_n$, which shows that the $n$th derivative of $J$ at $\lambda = 1$ is given by

$$\lim_{\lambda \to 1} J^{(n)} = n!\,j_n = \frac{(l-1+n)!}{(l-1)!}\,a^{(k+l)}_{k+l-1+n}. \tag{96}$$
Substituting these derivatives into Eq. (101) allows us to show Eq. (64) for k = l.