Derivative of the disturbance with respect to information from quantum measurements

To study the trade-off between information and disturbance, we obtain the first and second derivatives of the disturbance with respect to information for a fundamental class of quantum measurements. We focus on measurements lying on the boundaries of the physically allowed regions in four information–disturbance planes, using the derivatives to investigate the slopes and curvatures of these boundaries and hence clarify the shapes of the allowed regions.


Introduction
In quantum theory, any measurement that provides information about a physical system also inevitably disturbs the system's state in a way that depends on the measurement's outcome. This trade-off between information and disturbance is of great interest in establishing the foundations of quantum mechanics and plays an important role in quantum information processing and communication techniques [1], such as quantum cryptography [2][3][4][5]. Many authors [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23] have therefore discussed this trade-off using several different formulations. For example, Banaszek [7] found an inequality between the amount of information gained and the size of the state change, whereas Cheong and Lee [20] found another between the amount of information gained and the reversibility of the state change. Both inequalities have been verified in single-photon experiments [24][25][26][27].
Recently, we have also studied this trade-off, deriving the allowed regions in four types of information-disturbance plane [28]. These four information-disturbance pairs combine one information measure, namely the Shannon entropy [6] or estimation fidelity [7], with one disturbance measure, namely the operation fidelity [7] or physical reversibility [29]. The boundaries of the allowed regions give upper and lower bounds on the information for a given disturbance, together with the optimal measurements that saturate the upper bounds. The optimal measurements are different for each of the four pairs, because the allowed regions' upper boundaries have different curvatures on each of the information-disturbance planes [28].
Contrary to expectations, the allowed regions show that measurements providing more information do not necessarily cause larger disturbances. This is because the allowed regions have finite areas: for any given measurement corresponding to an interior point of an allowed region, there always exists another measurement that provides more information with smaller disturbance near that point. However, measurements that lie on the boundary of an allowed region in the information-disturbance plane are subject to a trade-off, meaning that modifying them along the boundary to increase the information obtained also increases the disturbance, at a rate given by the boundary's slope.
In this paper, we obtain the first and second derivatives of the disturbance with respect to the information obtained from measurements lying on the allowed regions' boundaries for each of the four information-disturbance pairs. These measurements are described by a diagonal operator with a continuous parameter and applied to a d-level system in a completely unknown state. For such measurements, we calculate these derivatives to demonstrate the slopes and curvatures of the allowed regions' boundaries, clarifying the regions' shapes and, hence, broadening our perspective on the trade-off between information and disturbance in quantum measurements. In fact, it was difficult to judge from the allowed regions shown in Ref. [28] whether the slopes of the boundaries are finite and whether the curvatures of the boundaries are negative at some points. In contrast, the first and second derivatives obtained in this paper give the values of the slopes and curvatures of the boundaries to answer these questions.
The rest of this paper is organized as follows: Sect. 2 reviews the procedure for quantifying the information and the disturbance in quantum measurements, giving their explicit forms for a fundamental class of measurements as functions of a certain parameter. Section 3 presents the first and second derivatives of the information and the disturbance for such measurements with respect to this parameter, while Sect. 4 gives the first and second derivatives of the disturbance with respect to the information. Finally, Sect. 5 summarizes our results.

Information and disturbance
In this section, we recall how the information and the disturbance are quantified in quantum measurements at the single-outcome level [11,30,31,32,33] and summarize the results of Ref. [28] in order to make this paper self-contained. Suppose we want to measure a d-level system known to be in one of a predefined set of pure states {|ψ(a)⟩}, where the probability of the system being in the state |ψ(a)⟩ is p(a), but we do not know the actual state of the system. To study the case where no prior information about the system is available, we assume that the set {|ψ(a)⟩} consists of all possible pure states and that p(a) is uniform according to a normalized invariant measure over the pure states.
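This uniform prior can be made concrete numerically. As a small sketch (our own illustration, not code from the paper; the function name `haar_states` is our choice), states distributed according to the normalized invariant (Haar) measure can be sampled by normalizing complex Gaussian vectors:

```python
import numpy as np

def haar_states(d, n, seed=0):
    """Sample n Haar-random pure states of a d-level system.

    Normalizing a vector of i.i.d. complex Gaussian entries yields a
    state drawn from the unitarily invariant measure, i.e., the uniform
    prior over pure states assumed in the text.
    """
    rng = np.random.default_rng(seed)
    v = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

# Sanity check: for a Haar-random |psi>, E[|<e_1|psi>|^2] = 1/d.
states = haar_states(4, 100_000)
mean_overlap = np.mean(np.abs(states[:, 0]) ** 2)
```

For d = 4 the mean overlap with any fixed state should be close to 1/4, reflecting the absence of prior information about the system.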
First, we quantify the amount of information provided by a given quantum measurement [28]. An ideal quantum measurement [34] can be described by a set of measurement operators {M̂_m} satisfying the completeness condition ∑_m M̂_m†M̂_m = Î, where m denotes the outcome of the measurement and Î is the identity operator.
When the system is in the state |ψ(a)⟩, a measurement {M̂_m} yields the outcome m with probability p(m|a) = ⟨ψ(a)|M̂_m†M̂_m|ψ(a)⟩ and changes the state to |ψ(m,a)⟩ = M̂_m|ψ(a)⟩/√p(m|a) (Eq. 3). The measurement outcome provides some information about the system's state: given the outcome m, the probability that the initial state was |ψ(a)⟩ is, by Bayes' rule, p(a|m) = p(m|a)p(a)/p(m), where p(m) = ∑_a p(a)p(m|a) is the total probability of the outcome m. The outcome m therefore changes the state probability distribution from {p(a)} to {p(a|m)}, decreasing the Shannon entropy by an amount I(m) (Eq. 6). This entropy change quantifies the amount of information provided by a measurement {M̂_m} with outcome m [11,35]. Note that I(m) is a measure of the information generated by a single outcome, unlike the average information I = ∑_m p(m)I(m) (Eq. 9) discussed in Ref. [6].

The measurement outcome m can also be used to estimate the system's state as |ϕ(m)⟩, where an optimal |ϕ(m)⟩ is the eigenvector of M̂_m†M̂_m corresponding to its maximum eigenvalue [7]. The quality of this estimate can be evaluated in terms of the estimation fidelity G(m) = ∑_a p(a|m)|⟨ϕ(m)|ψ(a)⟩|² (Eq. 10). This also quantifies the amount of information provided by the outcome m. Again, note that G(m) relates to a single outcome, unlike the average G (Eq. 12) discussed in Ref. [7].

Next, we quantify the degree of disturbance caused by the measurement {M̂_m} [28]. The outcome m changes the system's state from |ψ(a)⟩ to the |ψ(m,a)⟩ given by Eq. (3). The size of this change can be evaluated using the operation fidelity F(m) = ∑_a p(a|m)|⟨ψ(a)|ψ(m,a)⟩|² (Eq. 13). This quantifies the degree of disturbance caused when the measurement {M̂_m} yields the outcome m. Again, note that F(m) relates to a single outcome, unlike the average F (Eq. 15) discussed in Ref. [7]. In addition to the size of the state change, the reversibility of the change can also be used to quantify the disturbance, in the context of physically reversible measurements [36][37][38][39][40][41][42][43][44][45][46].
Even though |ψ(a)⟩ and |ψ(m,a)⟩ are unknown, the change can be physically reversed by a reversing measurement on |ψ(m,a)⟩ if M̂_m has a bounded left inverse M̂_m⁻¹ [39,40]. Such a reversing measurement can be described by another set of measurement operators {R̂^(m)_μ} with R̂^(m)_μ₀ ∝ M̂_m⁻¹ for a particular outcome μ = μ₀, where μ denotes the reversing measurement's outcome. When this measurement on |ψ(m,a)⟩ yields the preferred outcome μ₀, the system's state returns to |ψ(a)⟩ because R̂^(m)_μ₀M̂_m ∝ Î. The state recovery probability for an optimal reversing measurement [29] can then be used to evaluate the reversibility R(m) of the state change (Eq. 18). This also quantifies the degree of disturbance caused when the measurement {M̂_m} yields the outcome m. Again, note that R(m) relates to a single outcome, unlike the average R (Eq. 20) discussed in Refs. [20,29].
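The reversing scheme can be sketched numerically. In this illustration (ours, not the paper's construction), the reversing operator is R̂ = c M̂⁻¹ with the largest constant allowed by physicality, R̂†R̂ ≤ Î, which fixes |c|² to the minimum eigenvalue of M̂†M̂:

```python
import numpy as np

def reversal_demo(M, psi):
    """Reverse the state change caused by an invertible measurement
    operator M on |psi>, using R = c * M^{-1} with the largest c
    allowed by R†R <= I (a sketch of the scheme described in the text).
    Returns the success probability and the fidelity of the recovered
    state with the original |psi>."""
    MdM = M.conj().T @ M
    c2 = np.linalg.eigvalsh(MdM)[0]        # |c|^2 = min eigenvalue of M†M
    p = np.vdot(psi, MdM @ psi).real       # outcome probability p(m|a)
    post = (M @ psi) / np.sqrt(p)          # post-measurement state |psi(m,a)>
    R = np.sqrt(c2) * np.linalg.inv(M)     # reversing operator, R M ∝ I
    rec = R @ post
    q = np.vdot(rec, rec).real             # success probability = c2 / p(m|a)
    fidelity = abs(np.vdot(psi, rec / np.sqrt(q))) ** 2
    return q, fidelity

M = np.diag([1.0, 0.5])                    # invertible example operator
psi = np.array([1.0, 1.0]) / np.sqrt(2.0)
q, fid = reversal_demo(M, psi)             # fid = 1: the state is restored
```

The recovery succeeds with probability c²/p(m|a); averaging this over the posterior p(a|m) gives a state-independent quantity, which is why the reversibility can be attributed to the outcome m alone.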
As an important example, we consider the diagonal measurement operator M̂^(d)_{k,l}(λ) of Eq. (22), whose diagonal consists of k entries equal to 1, l entries equal to λ, and d − k − l entries equal to 0, with a continuous parameter 0 ≤ λ ≤ 1. The information yielded and the disturbance caused by this operator can be quantified in terms of I(m), G(m), F(m), and R(m), given by Eqs. (6), (10), (13), and (18) as functions of the parameter λ. Using the general formula derived in Ref. [33], I(m) can be calculated as in Eq. (23), where J is given by Eq. (24) with coefficients given by Eq. (25) for n = 0, 1, …, j. Likewise, G(m), F(m), and R(m) can be calculated as in Eqs. (27), (28), and (29) [33].

The measurement operator M̂^(d)_{k,l}(λ) is very important for obtaining the allowed regions in the information–disturbance planes, which are obtained by plotting all physically possible measurement operators. We consider four different allowed regions, based on using I(m) or G(m) to quantify the information and F(m) or R(m) to quantify the disturbance. Figure 1 shows these four allowed regions for d = 4 in gray [28], where the lines (k, l) correspond to M̂^(d)_{k,l}(λ) with 0 ≤ λ ≤ 1 and the P_r's denote the points corresponding to the projective measurement operator of rank r. Thus, the line (k, l) connects P_k to P_{k+l}, and the point P_d is at the top left corner of each plot. In Fig. 1, the upper boundaries of the allowed regions consist of the lines (1, d − 1), corresponding to M̂^(d)_{1,d−1}(λ), whereas the lower boundaries consist of the lines (k, 1), corresponding to M̂^(d)_{k,1}(λ) for k = 1, 2, …, d − 1. Therefore, to find the values of the slopes and curvatures of the boundaries, we need to calculate the first and second derivatives of the disturbance with respect to the information for M̂^(d)_{k,l}(λ).

The above allowed regions were obtained by considering ideal measurements, as in Eq. (3), with optimal estimates for G(m). Unfortunately, the lower boundaries can be violated by non-ideal measurements, which yield mixed post-measurement states due to classical noise, or by non-optimal estimates, which make suboptimal choices of |ϕ(m)⟩. Here, we ignore such non-quantum effects in order to focus on the quantum nature of measurement.
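Since Eqs. (27)–(29) are not reproduced above, the following sketch uses our own reconstruction of the single-outcome quantities for the diagonal operator (trace-based expressions of the Banaszek type; the function names and closed forms are our assumptions, not the paper's equations verbatim), together with a Monte Carlo average over Haar-random states as an independent check of the operation fidelity:

```python
import numpy as np

def g_f_r(d, k, l, lam):
    """Assumed closed forms for G(m), F(m), R(m) for the diagonal
    operator M = diag(1,...,1, lam,...,lam, 0,...,0) with k ones and
    l lam's (our reconstruction)."""
    t = k + l * lam**2                        # Tr(M†M)
    G = (t + 1.0) / ((d + 1) * t)             # max eigenvalue of M†M is 1
    F = (t + (k + l * lam) ** 2) / ((d + 1) * t)
    lam_min2 = lam**2 if k + l == d else 0.0  # min eigenvalue of M†M
    R = d * lam_min2 / t
    return G, F, R

def mc_operation_fidelity(d, k, l, lam, n=200_000, seed=1):
    """Monte Carlo estimate of F(m) from its definition: the
    posterior-weighted average of |<psi(a)|psi(m,a)>|^2."""
    rng = np.random.default_rng(seed)
    M = np.diag([1.0] * k + [lam] * l + [0.0] * (d - k - l))
    psi = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
    psi /= np.linalg.norm(psi, axis=1, keepdims=True)
    p = np.sum(np.abs(psi @ M) ** 2, axis=1)  # p(m|a) for each sampled state
    amp2 = np.abs(np.einsum("nd,nd->n", psi.conj(), psi @ M)) ** 2
    return np.sum(amp2) / np.sum(p)           # E[|<psi|M|psi>|^2] / E[p(m|a)]
```

At the endpoints these forms reproduce the projective values, e.g. F = (r + 1)/(d + 1) for a rank-r projector, consistent with the line (k, l) running from P_k at λ = 0 to P_{k+l} at λ = 1.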

Derivatives with respect to λ²
To calculate the derivative of the disturbance with respect to the information for M̂^(d)_{k,l}(λ), we first consider the derivatives of the information and the disturbance with respect to the parameter λ². For simplicity, we focus on derivatives with respect to λ² rather than λ itself. These derivatives are straightforward to calculate because the information and the disturbance are expressed as functions of λ = √(λ²) in Eqs. (23), (27), (28), and (29).
However, the expression for the derivative of I(m) is quite long, owing to the form of J given in Eq. (24). From Eq. (23), the first derivative of I(m) can be written in terms of J and its derivatives, where primes represent derivatives with respect to λ²; the first and second derivatives of J follow by differentiating Eq. (24) term by term. Figure 3 shows [I(m)]″ as a function of λ for d = 4, for various (k, l). From this, we can observe that [I(m)]″ > 0. As shown in "Appendix A," at λ = 0, J and its derivatives take the forms of Eq. (36), where a^(j)_{j+1} is given by Eq. (37) instead of Eq. (25). Here, the derivative of J in Eq. (36) diverges for k = 1 because of Eq. (38). Similarly, at λ = 1, J and its derivatives take the forms of Eq. (41). Likewise, from Eqs. (27), (28), and (29), the first and second derivatives of G(m), F(m), and R(m) can be obtained; their signs are summarized in Table 1. These signs mean that when λ² is increased, I(m) and G(m) decrease while F(m) and R(m) increase, so there is a trade-off between the information and the disturbance for M̂^(d)_{k,l}(λ).
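The signs in Table 1 can be checked numerically with central finite differences in u = λ², again using our reconstructed closed forms for G(m), F(m), and R(m) (an illustration under our assumptions, not the paper's Eqs. (27)–(29)):

```python
import numpy as np

def gfr(d, k, l, u):
    """G, F, R as functions of u = lambda^2 (reconstructed forms)."""
    lam = np.sqrt(u)
    t = k + l * u                              # Tr(M†M)
    G = (t + 1.0) / ((d + 1) * t)
    F = (t + (k + l * lam) ** 2) / ((d + 1) * t)
    R = d * u / t if k + l == d else 0.0
    return np.array([G, F, R])

def ddu(fun, u, h=1e-6):
    """Central finite difference with respect to u = lambda^2."""
    return (fun(u + h) - fun(u - h)) / (2 * h)

# For the line (k, l) = (1, 3) with d = 4: as lambda^2 grows,
# G decreases while F and R increase, matching Table 1.
checks = []
for u in (0.2, 0.5, 0.8):
    dG, dF, dR = ddu(lambda x: gfr(4, 1, 3, x), u)
    checks.append(dG < 0 and dF > 0 and dR > 0)
```

The same finite-difference probe applied at other (k, l) lines gives the same sign pattern, in line with the trade-off stated in the text.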

Derivatives with respect to information
Using the derivatives of the information and the disturbance with respect to λ², we can now calculate the derivative of the disturbance with respect to the information for M̂^(d)_{k,l}(λ). Let f and g be arbitrary functions of λ. Given the derivatives of f and g with respect to λ² (denoted by primes), the first and second derivatives of f with respect to g are

df/dg = f′/g′,    d²f/dg² = (f″g′ − f′g″)/(g′)³.

The same results can be obtained using derivatives with respect to λ.
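These parametric derivative formulas can be exercised on a toy pair of functions (our example, unrelated to the paper's quantities): with u = λ², take f = u² and g = u³, so that f = g^(2/3) and the exact derivatives df/dg = 2/(3u) and d²f/dg² = −2/(9u⁴) are known in closed form.

```python
def d_f_dg(fp, gp):
    """First derivative of f with respect to g: f'/g'
    (primes = derivatives with respect to u = lambda^2)."""
    return fp / gp

def d2_f_dg2(fp, fpp, gp, gpp):
    """Second derivative of f with respect to g:
    (f'' g' - f' g'') / (g')**3."""
    return (fpp * gp - fp * gpp) / gp**3

u = 2.0
fp, fpp = 2 * u, 2.0            # f = u^2: f' = 2u, f'' = 2
gp, gpp = 3 * u**2, 6 * u       # g = u^3: g' = 3u^2, g'' = 6u
first = d_f_dg(fp, gp)          # exact value: 2/(3u) = 1/3
second = d2_f_dg2(fp, fpp, gp, gpp)  # exact value: -2/(9 u^4) = -1/72
```

Substituting f = F(m) or R(m) and g = I(m) or G(m), with primes taken from the previous section, gives the slopes and curvatures discussed below.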
Figures 4a and 5a show the first and second derivatives of F(m) with respect to G(m) as functions of G(m) (Eq. 27) for d = 4, for various (k, l). Because λ = 0 corresponds to P_k and λ = 1 corresponds to P_{k+l} for the lines (k, l) in Fig. 1, the endpoint values of the derivatives follow from the limits at λ = 0 and λ = 1. However, by applying L'Hôpital's rule and considering higher derivatives, we can find the values at P_{k+l}, as shown in "Appendix C." The first derivative of F(m) with respect to I(m) (Fig. 4c) is negative, and the second derivative (Fig. 5c) is always negative if k ≥ l but can be positive near P_{k+l} if k < l. This means that the lines (k, l) in Fig. 1c are monotonically decreasing convex curves if k ≥ l, but monotonically decreasing S-shaped curves if k < l. In particular, even though it is difficult to see in Fig. 1c, the upper boundary (1, d − 1) has a slight dent near P_d when d ≥ 3 [28]. Finally, from Eqs. (31), (34), (46), and (50), the first and second derivatives of R(m) with respect to I(m) take the values of Eqs. (68) and (69) at P_{k+l}. In Eq. (68), the second derivative diverges for k = 1 because of the corresponding result in Eq. (40), and the divergences seen in Eq. (69) likewise come from Eq. (42). Note that these divergences arise because [I(m)]′ tends to zero from below, as shown in Fig. 2. The first derivative of R(m) with respect to I(m) (Fig. 4d) is negative and the second derivative (Fig. 5d) is positive, which means that all the lines (k, l) in Fig. 1d are monotonically decreasing concave curves. The signs of the derivatives for the four information–disturbance pairs are summarized in Table 2. All the first derivatives are negative, which implies that there is a trade-off between the information and the disturbance for each of the four pairs. In contrast, the second derivatives have different signs, which implies that the optimal measurements differ among the four pairs [28].

Conclusion
In this paper, we have obtained the first and second derivatives of the disturbance with respect to the information for a class of quantum measurements described by the measurement operator M̂^(d)_{k,l}(λ) (Eq. 22). When a measurement performed on a d-level system in a completely unknown state yields a single outcome m, the information is quantified by the Shannon entropy I(m) (Eq. 23) and the estimation fidelity G(m) (Eq. 27), while the disturbance is quantified by the operation fidelity F(m) (Eq. 28) and the physical reversibility R(m) (Eq. 29). In these four information–disturbance planes, M̂^(d)_{k,l}(λ) with 0 ≤ λ ≤ 1 corresponds to a line (k, l), as shown in Fig. 1. In particular, the lines (1, d − 1) and (k, 1) form the boundaries of the allowed regions obtained by plotting all physically possible measurement operators in these planes [28].
The slope and curvature of each line (k, l) are given by the first and second derivatives of the disturbance with respect to the information for M̂^(d)_{k,l}(λ); the signs of these derivatives are summarized in Table 2.
Based on these results, we can see that the boundaries (1, d − 1) and (k, 1) of the allowed regions have non-positive slopes for all four information–disturbance pairs, indicating that there is a trade-off between the information and the disturbance for measurements on these boundaries. When the information is increased by moving along a boundary, the disturbance also increases, decreasing F(m) and R(m). Moreover, the rate of change of the disturbance with respect to the information is given by the boundary's slope: Figure 4a shows that |dF(m)/dG(m)| is infinitely large near P_1 but almost zero near P_d. In contrast, the curvatures of the boundaries (1, d − 1) and (k, 1) have different signs for the four information–disturbance pairs. This means that the allowed regions are extended in different ways when the information and disturbance are averaged over all possible outcomes, as with I, G, F, and R, given by Eqs. (9), (12), (15), and (20), because the allowed regions for the average values are the convex hulls of those for a single outcome [28]. The upper boundaries of the allowed regions for the average values correspond to the optimal measurements that saturate the upper information bounds for a given disturbance. Consequently, the optimal measurements are different for each of the four information–disturbance pairs [28].
A Limits as λ → 0

Here, we show that J and its first and second derivatives with respect to λ² are as given in Eq. (36) in the limit as λ → 0. The second derivative of J (Eq. 35) can be evaluated for k ≥ 2 by using an identity derived from the terms of order k − 1 in the corresponding expansion. However, if k = 1, the derivative contains c^(l+1)_{l+1}(λ), which diverges in the limit as λ → 0, as shown by Eq. (38). By combining these results, we find that J and its derivatives are given by Eq. (36) at λ = 0.

B Limits as λ → 1
Here, we show that the first and second derivatives of J with respect to λ² are as given in Eq. (41) in the limit as λ → 1. To find the derivatives at λ = 1, we first obtain the Taylor series for J around λ = 1 by substituting λ² = 1 − ε into Eq. (24), writing J = ∑_n j_n ε^n. Note that the terms with negative powers of ε cancel each other out in this expansion because J is finite even at λ = 1 [33]. The coefficients {j_n} are related to the derivatives of J at λ = 1 by dⁿJ/d(λ²)ⁿ|_{λ=1} = (−1)ⁿ n! j_n. In Appendix C of Ref. [33], j_0 was shown to be a^(k+l)_{k+l−1}, as given in Eq. (41), and the other coefficients can be handled similarly.
For example, by applying Eq. (77) to c^(j)_n(√(1 − ε)), j_1 can be computed; the expression in the square brackets can be simplified using Eq. (25) together with an identity derived from the terms of order l − 1 in the corresponding expansion. Therefore, from Eq. (84), we see that J′ is given by Eq. (41) at λ = 1. Similarly, j_2 can be evaluated by the same route, showing that J″ is also given by Eq. (41) at λ = 1.
In general, we can use a similar argument to find j_n and hence the nth derivative of J at λ = 1.
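The relation between the Taylor coefficients {j_n} and the derivatives at λ = 1 can be checked on a toy function (our example, not the J of Eq. (24)): for h(u) = u³ with u = λ², expanding in ε = 1 − u gives h = 1 − 3ε + 3ε² − ε³.

```python
import math

# Toy coefficients for h(u) = u^3 expanded in eps = 1 - u:
# h = 1 - 3*eps + 3*eps^2 - eps^3.
j = [1.0, -3.0, 3.0, -1.0]

# The nth derivative of h with respect to u at u = 1 should equal
# (-1)^n * n! * j_n.
exact = [1.0, 3.0, 6.0, 6.0]   # h, h', h'', h''' at u = 1 for h = u^3
derived = [(-1) ** n * math.factorial(n) * j[n] for n in range(4)]
```

The two lists agree term by term, confirming the sign-and-factorial rule used to read off the derivatives of J from its ε-expansion.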

C Derivative calculations using L'Hôpital's rule
Here, we show that the first and second derivatives of F(m) with respect to I(m) are as given in Eqs. (63) and (64), respectively, in the limit as λ → 1. We need to apply L'Hôpital's rule to find these derivatives because Eqs. (58) and (59) yield the indeterminate form 0/0 in the limit as λ → 1, owing to the vanishing of the derivatives at λ = 1.
Substituting these derivatives into Eq. (101) allows us to show Eq. (64) for k = l.