1 Introduction

Measurement on a quantum system is of essential importance for investigating physical properties of a quantum system [1–4] and for extracting information encoded on a quantum system in quantum information processing [5–7]. To perform a quantum measurement, we first prepare a measurement apparatus in an appropriate initial state and then make it interact with the quantum system to be measured. After the interaction, we obtain a value shown by a pointer observable of the apparatus. When we repeat the measurement on an ensemble of identically prepared systems, we obtain a set of values of the pointer observable, from which we can derive statistical properties of the measured observable such as its average value and fluctuation. Let us denote an initial state of the system as \(\vert \psi _{i}\rangle \) and a measured observable as \(\hat{A}\). Then the average value derived from the measurement outcomes is given by \(\langle \hat{A}\rangle =\langle \psi _{i}\vert \hat{A}\vert \psi _{i}\rangle \), regardless of whether the coupling between the system and the apparatus is strong or weak. However, when a final state \(\vert \psi _{f}\rangle \) of the system is fixed, the measurement result becomes quite different. Fixing the final state is called a postselection of the system, which can be done probabilistically. The theory of a quantum system with postselection has been developed by Aharonov et al. [8–10], who aimed to formulate time-symmetric quantum mechanics [8]. 
For a postselected system, if the system-apparatus coupling is sufficiently strong, we obtain the average value \(A_{s}=\sum _{a}a\vert \langle \psi _{f}\vert a\rangle \vert ^{2} \vert \langle a\vert \psi _{i}\rangle \vert ^{2} /\sum _{a}\vert \langle \psi _{f}\vert a\rangle \vert ^{2} \vert \langle a\vert \psi _{i}\rangle \vert ^{2}\) [9] from the measurement outcomes, while if it is extremely weak, we obtain the real and imaginary parts of \(A_{w}=\langle \psi _{f}\vert \hat{A}\vert \psi _{i}\rangle /\langle \psi _{f}\vert \psi _{i}\rangle \) [11], where we have assumed the spectral decomposition \(\hat{A}=\sum _{a}a\vert a\rangle \langle a\vert \) of the measured observable. The former is called the postselected strong value of \(\hat{A}\) and the latter is the weak value of \(\hat{A}\). The weak value may be complex and may lie outside the spectral range of \(\hat{A}\). Since the weak value was introduced [11], many authors have investigated its peculiar properties and potential applications [12–38].
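As a numerical illustration of these two definitions (not part of the original analysis), the following minimal Python sketch evaluates \(A_{s}\) and \(A_{w}\) for \(\hat{A}=\hat{\sigma }_{z}\) with nearly orthogonal pre- and postselected pure states. The function names `strong_value` and `weak_value`, the chosen states and the parameter `eps` are assumptions made only for this example.

```python
import math

def strong_value(eigs, psi_i, psi_f):
    """Postselected strong value A_s for pure pre- and postselected states.

    States are lists of amplitudes in the eigenbasis of the observable.
    """
    weights = [abs(f) ** 2 * abs(i) ** 2 for f, i in zip(psi_f, psi_i)]
    return sum(a * p for a, p in zip(eigs, weights)) / sum(weights)

def weak_value(eigs, psi_i, psi_f):
    """Weak value A_w = <psi_f|A|psi_i> / <psi_f|psi_i>."""
    num = sum(complex(f).conjugate() * a * i
              for a, f, i in zip(eigs, psi_f, psi_i))
    den = sum(complex(f).conjugate() * i for f, i in zip(psi_f, psi_i))
    return num / den

eps = 0.1                                  # hypothetical small angle
eigs = [1.0, -1.0]                         # spectrum of sigma_z
psi_i = [math.cos(math.pi / 4), math.sin(math.pi / 4)]
theta_f = -math.pi / 4 + eps               # nearly orthogonal to psi_i
psi_f = [math.cos(theta_f), math.sin(theta_f)]

A_s = strong_value(eigs, psi_i, psi_f)
A_w = weak_value(eigs, psi_i, psi_f)
print("strong value:", A_s)
print("weak value:  ", A_w)
```

For these states the weak value equals \(\cot \varepsilon \), which is far outside the spectral range \([-1,1]\) for small \(\varepsilon \), whereas the strong value equals \(\sin 2\varepsilon \) and stays inside it.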

We have three average values of an observable \(\hat{A}\): the average value \(\langle \hat{A}\rangle \), the strong value \(A_{s}\) and the weak value \(A_{w}\). It may be of great interest to investigate whether there is another average value of an observable in a postselected quantum system, which is neither the strong value nor the weak value. It is well known that a system to be measured irreversibly and probabilistically changes its quantum state due to an interaction with a measurement apparatus and readout of pointer observables [1–4]. The effect of measurement on the system is described by a trace non-preserving linear map \(\hat{\fancyscript{L}}_{a}\) [39–42], which depends on the properties of the measurement that we perform on the quantum system. If the linear map \(\hat{\fancyscript{L}}_{a}\) is expressed only in terms of eigenprojectors of a measured observable, we can show that an average value is provided by a mixture of the strong and weak values of the observable. Furthermore, we will explain how to implement a measurement process for obtaining such a mixture.

This paper is organized as follows: In Sect. 2, we show that a conditional average of an observable in a postselected quantum system is given by a mixture of the strong and weak values of an observable if the effect of measurement on a system is described by a linear map which can be expressed only in terms of eigenprojectors of the measured observable. In Sect. 3, we explain how to implement a measurement process which provides the conditional average value. Furthermore, we show for a qubit system that the measurement process can be constructed by making use of a lossless beam splitter, a single-qubit unitary gate, two controlled NOT gates and readout of the Pauli matrices of a measurement apparatus. In Sect. 4, we provide a brief summary of this paper.

2 Average value in a postselected system

We consider a conditional average of an observable in a quantum system with postselection. In this paper, for the sake of simplicity, we ignore a state change caused by dynamical time evolution of a quantum system between the initial preparation and the postselection. We suppose that a system is initially prepared in a quantum state which is represented by a density operator \(\hat{\rho }_{i}\). At a later time, we perform a postselection of the system by means of a projective or generalized measurement, which is described by a positive operator-valued measure [43, 44], that is, a set of measurement operators \(\{\hat{M}_{1},\hat{M}_{2},\ldots ,\hat{M}_{f},\ldots \}\) with positivity \(\hat{M}_{f}\ge 0\) and completeness \(\sum _{f}\hat{M}_{f}=\hat{1}\), where we denote an identity operator as \(\hat{1}\). The success probability of the postselection \(f\) is given by

$$\begin{aligned} P(f\vert i)=\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}], \end{aligned}$$
(1)

where \(\mathrm {Tr}\) stands for the trace operation over a Hilbert space of the system. Next we suppose that an observable \(\hat{A}\) of the system is measured between the initial preparation and the postselection of the system. The spectral decomposition of \(\hat{A}\) is given by \(\hat{A}=\sum _{a}a\hat{\Pi }_{a}\), where \(a\) is an eigenvalue and \(\hat{\Pi }_{a}\) is a projection operator onto the eigenspace associated with the eigenvalue \(a\). For the sake of simplicity, we assume that the observable has a discrete spectrum. We denote as \(P(a,f\vert i)\) the joint probability that the measurement outcome \(a\) is obtained and the postselection \(f\) is performed on the system for the given initial state \(\hat{\rho }_{i}\). If the postselection is not performed, the probability of obtaining the measurement outcome \(a\) is given by

$$\begin{aligned} P(a\vert i)=\mathrm {Tr}[\hat{\Pi }_{a}\hat{\rho }_{i}], \end{aligned}$$
(2)

which is a marginal of the joint probability, that is, \(P(a\vert i)=\sum _{f}P(a,f\vert i)\). However, the probability \(P(f\vert i)\) given by Eq. (1) is not a marginal of the joint probability \(P(a,f\vert i)\) due to the effect of the back action of the measurement performed before the postselection. Using the Bayes theorem [45], we find the conditional probability that the measurement outcome \(a\) is obtained for the given initial state \(\hat{\rho }_{i}\) and postselection \(f\) of the system,

$$\begin{aligned} P(a\vert f,i)=\frac{P(a,f\vert i)}{\sum _{a}P(a,f\vert i)}. \end{aligned}$$
(3)

Then the conditional average of the observable \(\hat{A}\) is given by

$$\begin{aligned} A_{fi}=\sum _{a}aP(a\vert f,i) =\frac{\sum _{a}aP(a,f\vert i)}{\sum _{a}P(a,f\vert i)}. \end{aligned}$$
(4)

In the following, we will show that under certain conditions, the conditional average \(A_{fi}\) can be expressed as a mixture of the strong value and the weak value of the observable \(\hat{A}\).

We obtain the joint probability under certain conditions. First we suppose that the initial state \(\hat{\rho }_{i}\) of the system is a convex combination of two density operators, say, \(\hat{\rho }_{i}=p_{1}\hat{\rho }_{1}+p_{2}\hat{\rho }_{2}\) with \(p_{k}\ge 0\) (\(k=1,2\)) and \(p_{1}+p_{2}=1\). It is reasonable to consider that the joint probability \(P(a,f\vert i)\) satisfies the equality \(P(a,f\vert i)=p_{1}P(a,f\vert 1)+p_{2}P(a,f\vert 2)\). Furthermore, when the system is postselected by the measurement operators \(\hat{M}_{1}\) or \(\hat{M}_{2}\), the equality \(P(a,f\vert i)=P(a,1\vert i)+P(a,2\vert i)\) should be established since \(\hat{M}_{f}=\hat{M}_{1}+\hat{M}_{2}\). These imply that the joint probability \(P(a,f\vert i)\) is bilinear with respect to \(\hat{\rho }_{i}\) and \(\hat{M}_{f}\). Thus we assume that the joint probability \(P(a,f\vert i)\) is given by

$$\begin{aligned} P(a,f\vert i)=\mathrm {Tr}[\hat{\fancyscript{L}}_{a}(\hat{M}_{f},\hat{\rho }_{i})], \end{aligned}$$
(5)

where \(\hat{\fancyscript{L}}_{a}\) is a linear map (or a superoperator) which characterizes the effect of the measurement of \(\hat{A}\). If we do not perform the measurement on the system, the map \(\hat{\fancyscript{L}}_{a}\) should be \(\hat{\fancyscript{L}}_{a}(\hat{M}_{f},\hat{\rho }_{i}) =\hat{M}_{f}\hat{\rho }_{i}\) so that Eq. (5) reduces to Eq. (1). In the rest of this paper, we consider the special case that the map \(\hat{\fancyscript{L}}_{a}\) can be expressed only in terms of the eigenprojectors \(\hat{\Pi }_{a}\) of the measured observable \(\hat{A}\). Since the equality \(\hat{\Pi }_{a}^{2}=\hat{\Pi }_{a}\) holds for the eigenprojector, we find that \(\hat{\fancyscript{L}}_{a}(\hat{M}_{f},\hat{\rho }_{i})\) is a linear combination of \(\hat{\Pi }_{a}^{\ell }\hat{M}_{f}\hat{\Pi }_{a}^{m}\hat{\rho }_{i}\hat{\Pi }_{a}^{n}\) (\(\ell ,m,n=0,1\)). Then considering the property of the trace, we can write the joint probability in the following form:

$$\begin{aligned} P(a,f\vert i)=s\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i} \hat{\Pi }_{a}]+\frac{1}{2}w\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a} \hat{\rho }_{i}]+\frac{1}{2}u\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i} \hat{\Pi }_{a}], \end{aligned}$$
(6)

where we have neglected the term \(\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}]\) since it does not include the effect of the measurement. In Eq. (6), we have assumed that the expansion coefficients \(s\), \(w\) and \(u\) are independent of \(a\). Since the joint probability \(P(a,f\vert i)\) should be real, we have the equalities \(u=w^{*}\) and \(s^{*}=s\). Furthermore, using the fact that the equality \(\sum _{f}P(a,f\vert i)=P(a \vert i)=\mathrm {Tr}[\hat{\Pi }_{a}\hat{\rho }_{i}]\) holds, we derive the condition for the parameters \(s\) and \(w\),

$$\begin{aligned} s+\mathrm {Re}w=1. \end{aligned}$$
(7)
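For completeness, the step leading to Eq. (7) can be written out as follows. It uses only the completeness \(\sum _{f}\hat{M}_{f}=\hat{1}\), the cyclic property of the trace, the equality \(\hat{\Pi }_{a}^{2}=\hat{\Pi }_{a}\) and \(u=w^{*}\):

$$\begin{aligned} \sum _{f}P(a,f\vert i) &=s\mathrm {Tr}[\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}] +\frac{1}{2}w\mathrm {Tr}[\hat{\Pi }_{a}\hat{\rho }_{i}] +\frac{1}{2}w^{*}\mathrm {Tr}[\hat{\rho }_{i}\hat{\Pi }_{a}] \\ &=(s+\mathrm {Re}w)\mathrm {Tr}[\hat{\Pi }_{a}\hat{\rho }_{i}]. \end{aligned}$$

Agreement with Eq. (2) for every initial state \(\hat{\rho }_{i}\) therefore requires \(s+\mathrm {Re}w=1\).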

When we consider the non-selective measurement of \(\hat{A}\), the success probability of the postselection \(f\) is given by

$$\begin{aligned} P_{\mathrm {non}}(f\vert i)=\sum _{a}P(a,f\vert i) =s\sum _{a}\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i} \hat{\Pi }_{a}]+(\mathrm {Re}w)\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}], \end{aligned}$$
(8)

where we have used the completeness relation \(\sum _{a}\hat{\Pi }_{a}=\hat{1}\). It should be noted that the post-measurement state of the system after the non-selective measurement can be written in the form of \(\hat{\rho }_{\mathrm {non}}=\mathrm {Tr}_{M}[\hat{U}(\hat{\rho }_{i} \otimes \hat{\rho }_{M})\hat{U}^{\dagger }]\), where \(\hat{U}\) is a unitary operator of a system-apparatus interaction, \(\hat{\rho }_{M}\) is an initial state of a measurement apparatus and \(\mathrm {Tr}_{M}\) stands for the trace operation over a Hilbert space of the apparatus. Since the probability \(P_{\mathrm {non}}(f\vert i)=\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{\mathrm {non}}]\) is non-negative, we find that the parameters \(s\) and \(w\) satisfy

$$\begin{aligned} s\ge 0, \quad \mathrm {Re}w\ge 0. \end{aligned}$$
(9)

Therefore, we finally obtain the joint probability of the measurement outcome \(a\) and the postselection \(f\),

$$\begin{aligned} P(a,f\vert i)= & {} s\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a} \hat{\rho }_{i}\hat{\Pi }_{a}] +\frac{1}{2}w\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i}] +\frac{1}{2}w^{*}\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}\hat{\Pi }_{a}]\nonumber \\= & {} s\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a} \hat{\rho }_{i}\hat{\Pi }_{a}] +\frac{1}{2}(\mathrm {Re}w) \mathrm {Tr}[\hat{M}_{f}\{\hat{\Pi }_{a},\hat{\rho }_{i}\}] +\frac{i}{2}(\mathrm {Im}w) \mathrm {Tr}[\hat{M}_{f}[\hat{\Pi }_{a},\hat{\rho }_{i}]], \end{aligned}$$
(10)

with anti-commutator \(\{\hat{A},\hat{B}\}=\hat{A}\hat{B}+\hat{B}\hat{A}\) and commutator \([\hat{A},\hat{B}]=\hat{A}\hat{B}-\hat{B}\hat{A}\). Note that \(P(a,f\vert i)\) is not necessarily positive, though it satisfies the normalization condition \(\sum _{f}\sum _{a}P(a,f\vert i)=1\). Hence it is a quasi-probability. This will be discussed later.
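The properties just stated can be checked numerically. The sketch below, written in plain Python for illustration only, evaluates the first line of Eq. (10) for a qubit with \(\hat{A}=\hat{\sigma }_{z}\) and a two-outcome projective postselection. The helper names (`mm`, `tr`, `proj`, `joint_quasi_prob`) and the particular states are assumptions of this example, not part of the original treatment.

```python
import math

def mm(*ms):
    """Product of square matrices (nested lists), multiplied left to right."""
    out = ms[0]
    n = len(out)
    for m in ms[1:]:
        out = [[sum(out[r][k] * m[k][c] for k in range(n)) for c in range(n)]
               for r in range(n)]
    return out

def tr(m):
    """Trace of a square matrix."""
    return sum(m[k][k] for k in range(len(m)))

def proj(v):
    """Rank-one projector |v><v| for a normalized vector v."""
    return [[complex(x) * complex(y).conjugate() for y in v] for x in v]

def joint_quasi_prob(M_f, Pi_a, rho_i, s, w):
    """First line of Eq. (10)."""
    w = complex(w)
    val = (s * tr(mm(M_f, Pi_a, rho_i, Pi_a))
           + 0.5 * w * tr(mm(M_f, Pi_a, rho_i))
           + 0.5 * w.conjugate() * tr(mm(M_f, rho_i, Pi_a)))
    # The last two terms are complex conjugates, so the sum is real.
    return val.real

projectors = {+1: [[1, 0], [0, 0]], -1: [[0, 0], [0, 1]]}  # sigma_z
rho_i = proj([math.cos(math.pi / 4), math.sin(math.pi / 4)])
theta_f = -math.pi / 4 + 0.1
M_1 = proj([math.cos(theta_f), math.sin(theta_f)])
M_2 = [[(1 if r == c else 0) - M_1[r][c] for c in range(2)] for r in range(2)]

s, w = 0.0, 1.0  # weak-measurement limit; satisfies s + Re w = 1
table = {(a, f): joint_quasi_prob(M, Pi, rho_i, s, w)
         for a, Pi in projectors.items()
         for f, M in ((1, M_1), (2, M_2))}
total = sum(table.values())
marginals = {a: table[(a, 1)] + table[(a, 2)] for a in projectors}
print(table, total, marginals)
```

In this example the entry for \(a=-1\), \(f=1\) is negative, while the four entries still sum to one and the marginal over \(f\) reproduces \(\mathrm {Tr}[\hat{\Pi }_{a}\hat{\rho }_{i}]\).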

Substituting Eq. (10) into Eq. (3), we find the conditional quasi-probability \(P(a\vert f,i)\) that the measurement outcome \(a\) is obtained for the given initial state \(\hat{\rho }_{i}\) and postselection \(f\) of the system,

$$\begin{aligned} P(a\vert f,i) =\frac{s\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a} \hat{\rho }_{i}\hat{\Pi }_{a}] +\frac{1}{2}w\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i}] +\frac{1}{2}w^{*}\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}\hat{\Pi }_{a}]}{sQ(f\vert i)+\mathrm {Re}(w)P(f\vert i)}, \end{aligned}$$
(11)

where \(Q(f\vert i)=\sum _{a}\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a} \hat{\rho }_{i}\hat{\Pi }_{a}]\) and \(P(f\vert i)=\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}]\) are the success probabilities of the postselection \(f\) with and without the measurement of the observable \(\hat{A}\) before performing the postselection. The former includes the effect of the back action of the measurement of \(\hat{A}\). Then the conditional average \(A_{fi}=\sum _{a}aP(a\vert f,i)\) of the observable \(\hat{A}\) is given by

$$\begin{aligned} A_{fi}=\frac{sA_{s}Q(f\vert i)+\mathrm {Re}(wA_{w})P(f\vert i)}{sQ(f\vert i)+\mathrm {Re}(w)P(f\vert i)}. \end{aligned}$$
(12)

In this equation, \(A_{s}\) is the strong value of the observable \(\hat{A}\) [9],

$$\begin{aligned} A_{s}=\frac{\sum _{a}a\mathrm {Tr}[\hat{M}_{f} \hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}]}{\sum _{a}\mathrm {Tr}[\hat{M}_{f} \hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}]}, \end{aligned}$$
(13)

and \(A_{w}\) is the weak value [11],

$$\begin{aligned} A_{w}=\frac{\mathrm {Tr}[\hat{M}_{f}\hat{A}\hat{\rho }_{i}]}{\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}]}. \end{aligned}$$
(14)

This result means that the conditional average \(A_{fi}\) is represented by a mixture of the strong value \(A_{s}\) and the weak value \(A_{w}\) of the measured observable \(\hat{A}\). In particular, we have

$$\begin{aligned} A_{fi}=\left\{ \begin{array}{l@{\quad }l} A_{s} &{} (s=1, w=0) \\ \mathrm {Re}A_{w} &{} (s=0,\,w=1) \\ \mathrm {Re}A_{w}\pm \mathrm {Im}A_{w} &{} (s=0,\,w=1\mp i) \end{array}\right. \end{aligned}$$
(15)
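The special cases in Eq. (15) can be reproduced numerically from Eqs. (12)–(14). The following Python sketch is only an illustration under stated assumptions: the function `conditional_average`, the helpers `mm`, `tr` and `proj`, and the particular qubit states are hypothetical choices made for this example.

```python
import cmath
import math

def mm(*ms):
    """Product of square matrices (nested lists), multiplied left to right."""
    out = ms[0]
    n = len(out)
    for m in ms[1:]:
        out = [[sum(out[r][k] * m[k][c] for k in range(n)) for c in range(n)]
               for r in range(n)]
    return out

def tr(m):
    """Trace of a square matrix."""
    return sum(m[k][k] for k in range(len(m)))

def proj(v):
    """Rank-one projector |v><v| for a normalized vector v."""
    return [[complex(x) * complex(y).conjugate() for y in v] for x in v]

def conditional_average(M_f, rho_i, eigs, projs, s, w):
    """Return (A_fi, A_s, A_w) following Eqs. (12), (13) and (14)."""
    w = complex(w)
    strong_terms = [tr(mm(M_f, P, rho_i, P)).real for P in projs]
    Q = sum(strong_terms)                       # Q(f|i)
    P_f = tr(mm(M_f, rho_i)).real               # P(f|i)
    A_s = sum(a * t for a, t in zip(eigs, strong_terms)) / Q
    A_w = sum(a * tr(mm(M_f, P, rho_i)) for a, P in zip(eigs, projs)) / P_f
    A_fi = (s * A_s * Q + (w * A_w).real * P_f) / (s * Q + w.real * P_f)
    return A_fi, A_s, A_w

eigs = [1.0, -1.0]                              # sigma_z
projs = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
rho_i = proj([math.cos(math.pi / 3), math.sin(math.pi / 3)])
M_f = proj([math.cos(0.4), cmath.exp(0.7j) * math.sin(0.4)])

A_strong, A_s, A_w = conditional_average(M_f, rho_i, eigs, projs, 1, 0)
A_weak = conditional_average(M_f, rho_i, eigs, projs, 0, 1)[0]
A_mix = conditional_average(M_f, rho_i, eigs, projs, 0, 1 - 1j)[0]
print(A_strong, A_weak, A_mix, A_w)
```

The relative phase in the postselected state is chosen so that \(\mathrm {Im}A_{w}\ne 0\); otherwise the second and third cases of Eq. (15) would coincide.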

When the postselection of the system is not performed, the measurement operator \(\hat{M}_{f}\) is replaced with an identity operator, \(\hat{M}_{f}=\hat{1}\). In this case, both strong and weak values become equal to the usual average value of the observable \(\hat{A}\), that is, \(A_{s}=A_{w}=\mathrm {Tr}[\hat{A}\hat{\rho }_{i}]\equiv \langle \hat{A}\rangle \). This means that the conditional average \(A_{fi}\) is also equal to \(\langle \hat{A}\rangle \). Furthermore, using the success probability of the postselection \(P_{\mathrm {suc}}(f\vert i)=\sum _{a} P(a,f\vert i)\), we find that the average of \(A_{fi}\) over all possible postselections is given by

$$\begin{aligned} \sum _{f}A_{fi}P_{\mathrm {suc}}(f\vert i)= & {} s\sum _{f}A_{s}Q(f\vert i)+\sum _{f}\mathrm {Re}(wA_{w})P(f\vert i) \nonumber \\= & {} [s+\mathrm {Re}(w)]\langle \hat{A}\rangle =\langle \hat{A}\rangle , \end{aligned}$$
(16)

where we have used Eq. (7).
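The two sums over \(f\) in the first line of Eq. (16) can be evaluated explicitly. With the completeness relation \(\sum _{f}\hat{M}_{f}=\hat{1}\) and the definitions in Eqs. (13) and (14), one finds

$$\begin{aligned} \sum _{f}A_{s}Q(f\vert i) &=\sum _{a}a\mathrm {Tr}[\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}] =\mathrm {Tr}[\hat{A}\hat{\rho }_{i}]=\langle \hat{A}\rangle , \\ \sum _{f}A_{w}P(f\vert i) &=\sum _{f}\mathrm {Tr}[\hat{M}_{f}\hat{A}\hat{\rho }_{i}] =\mathrm {Tr}[\hat{A}\hat{\rho }_{i}]=\langle \hat{A}\rangle . \end{aligned}$$

Since \(\langle \hat{A}\rangle \) is real, \(\mathrm {Re}(w\langle \hat{A}\rangle )=(\mathrm {Re}w)\langle \hat{A}\rangle \), which gives the second line of Eq. (16).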

Before proceeding further, we briefly note the properties of the strong value and the weak value. The strong value is expressed as \(A_{s}=\sum _{a}aQ_{s}(a \vert f,i)\) with

$$\begin{aligned} Q_{s}(a \vert f,i)=\frac{\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}]}{\sum _{a}\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}]}. \end{aligned}$$
(17)

It is easy to see that \(Q_{s}(a\vert f,i)\) satisfies the positivity \(Q_{s}(a\vert f,i)\ge 0\) and the normalization condition \(\sum _{a}Q_{s}(a\vert f,i)=1\). Hence the inequality \(a_{\max }\ge A_{s}\ge a_{\min }\) is always fulfilled, where \(a_{\max }\) and \(a_{\min }\) stand for the maximum and minimum eigenvalues of \(\hat{A}\). On the other hand, the real part of the weak value is given by \(\mathrm {Re}A_{w}=\sum _{a}a P_{r}(a\vert f,i)\) with

$$\begin{aligned} P_{r}(a\vert f,i)=\frac{\mathrm {Tr}[\hat{M}_{f} (\hat{\Pi }_{a}\hat{\rho }_{i}+\hat{\rho }_{i}\hat{\Pi }_{a})]}{2\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}]}, \end{aligned}$$
(18)

which satisfies the normalization condition \(\sum _{a}P_{r}(a\vert f,i)=1\). If the positivity \(P_{r}(a\vert f,i)\ge 0\) is fulfilled, we have the inequality \(a_{\max }\ge \mathrm {Re}A_{w}\ge a_{\min }\) for the weak value. However, it is known that the weak value can take a value beyond the spectral range of the observable \(\hat{A}\) [11]. This means that the conditional probability \(P_{r}(a\vert f,i)\) can take negative values. Setting \(s=0\) and \(w=1\) in Eq. (11), we obtain \(P(a\vert f,i)=P_{r}(a\vert f,i)\). The non-positivity of the joint quasi-probability is related to the anomaly of the weak value.

By making use of the equalities \(\hat{\Pi }_{a}^{2}=\hat{\Pi }_{a}\) and \(s+\mathrm {Re}w=1\), the joint quasi-probability \(P(a,f\vert i)\) given by Eq. (10) can be separated into positive and non-positive parts:

$$\begin{aligned} P(a,f\vert i)=P_{p}(a,f\vert i)+P_{n}(a,f\vert i), \end{aligned}$$
(19)

with

$$\begin{aligned} P_{p}(a,f\vert i)=\mathrm {Tr}[\hat{M}_{f}\hat{W}_{p}(a)], \quad P_{n}(a,f\vert i)=\mathrm {Tr}[\hat{M}_{f}\hat{W}_{n}(a)]. \end{aligned}$$
(20)

In this equation, the positive and non-positive Hermitian operators, \(\hat{W}_{p}(a)\) and \(\hat{W}_{n}(a)\), are given by

$$\begin{aligned} \hat{W}_{p}(a)= & {} \hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}, \end{aligned}$$
(21)
$$\begin{aligned} \hat{W}_{n}(a)= & {} \frac{1}{2}w\hat{\Pi }_{a}[\hat{\Pi }_{a},\hat{\rho }_{i}] +\frac{1}{2}w^{*}[\hat{\rho }_{i},\hat{\Pi }_{a}]\hat{\Pi }_{a}, \end{aligned}$$
(22)

which satisfy the relations,

$$\begin{aligned} \hat{\Pi }_{a}\hat{W}_{p}(a)\hat{\Pi }_{a}=\hat{W}_{p}(a), \quad \hat{\Pi }_{a}\hat{W}_{n}(a)\hat{\Pi }_{a}=0. \end{aligned}$$
(23)
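These relations follow in one line from \(\hat{\Pi }_{a}^{2}=\hat{\Pi }_{a}\), reading the second term of \(\hat{W}_{n}(a)\) as the operator \(\frac{1}{2}w^{*}[\hat{\rho }_{i},\hat{\Pi }_{a}]\hat{\Pi }_{a}\):

$$\begin{aligned} \hat{\Pi }_{a}\hat{W}_{p}(a)\hat{\Pi }_{a} &=\hat{\Pi }_{a}^{2}\hat{\rho }_{i}\hat{\Pi }_{a}^{2} =\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}, \\ \hat{\Pi }_{a}\hat{W}_{n}(a)\hat{\Pi }_{a} &=\frac{1}{2}w(\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a} -\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}) +\frac{1}{2}w^{*}(\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a} -\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a})=0. \end{aligned}$$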

This means that \(\hat{W}_{p}(a)\) is defined on the eigenspace of the eigenvalue \(a\) and \(\hat{W}_{n}(a)\) on the orthogonal subspace. If the strong measurement is performed, only the positive operator \(\hat{W}_{p}(a)\) appears. The joint quasi-probability \(P(a,f\vert i)\) can also be expressed as

$$\begin{aligned} P(a,f\vert i)= & {} \mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}] +\mathrm {Re}\{w\mathrm {Tr}(\hat{M}_{f}\hat{\Pi }_{a}[\hat{\Pi }_{a},\hat{\rho }_{i}])\} \nonumber \\= & {} \mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}] +\mathrm {Re}\{w\mathrm {Tr}([\hat{M}_{f},\hat{\Pi }_{a}]\hat{\Pi }_{a}\hat{\rho }_{i})\}, \end{aligned}$$
(24)

from which we find that if \([\hat{\Pi }_{a},\hat{\rho }_{i}]=0\) or \([\hat{M}_{f},\hat{\Pi }_{a}]=0\) is fulfilled, \(P(a,f\vert i)\) becomes non-negative. The joint quasi-probability \(P(a,f\vert i)\) can also be rewritten as

$$\begin{aligned} P(a,f\vert i)= & {} s\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i} \hat{\Pi }_{a}] +P(f\vert i)\left\{ (\mathrm {Re}w)\mathrm {Tr}\left[ \hat{\Pi }_{a} \left( \frac{\hat{M}_{f}\hat{\rho }_{i}+\hat{\rho }_{i}\hat{M}_{f}}{2\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}]} \right) \right] \right. \nonumber \\&\left. +(\mathrm {Im}w)\mathrm {Tr}\left[ \hat{\Pi }_{a} \left( \frac{\hat{M}_{f}\hat{\rho }_{i}-\hat{\rho }_{i}\hat{M}_{f}}{2i \mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}]} \right) \right] \right\} \nonumber \\= & {} s\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i}\hat{\Pi }_{a}] +P(f\vert i)\mathrm {Re}\left\{ w\mathrm {Tr}\left[ \hat{\Pi }_{a}\frac{\hat{\rho }_{i}\hat{M}_{f}}{\mathrm {Tr}[\hat{\rho }_{i}\hat{M}_{f}]}\right] \right\} , \end{aligned}$$
(25)

where \(\hat{\rho }_{i}\hat{M}_{f}/\mathrm {Tr}[\hat{\rho }_{i}\hat{M}_{f}]\) is the transient density operator (the connection-state density operator) [23, 24, 36]. Furthermore, the joint quasi-probability \(P(a,f\vert i)\) can also be expressed as

$$\begin{aligned} P(a,f\vert i) =sP_{W}(a,f\vert i)+\mathrm {Re}[wP_{K}(a,f\vert i)], \end{aligned}$$
(26)

where \(P_{W}(a,f\vert i)=\mathrm {Tr}[\hat{M}_{f}\hat{\Pi }_{a}\hat{\rho }_{i} \hat{\Pi }_{a}]\) is the Wigner joint probability distribution [46] and \(P_{K}(a,f\vert i)=\mathrm {Tr}[\hat{M}_{f}\hat{\rho }_{i}\hat{\Pi }_{a}]\) is the Kirkwood–Dirac distribution [46–48].

In the above consideration, we have ignored the time evolution of the system between the initial preparation and the postselection. It is easy to take the time evolution of the system into account in the conditional average. Here we assume that the initial preparation, the measurement and the postselection are, respectively, performed at time \(t_{i}\), \(t_{m}\) and \(t_{f}\), and we denote the time evolution operator of the system as \(\hat{U}(t)\). Then the conditional average value \(A_{fi}\) is given by Eq. (12) if we replace the strong value \(A_{s}\), the weak value \(A_{w}\) and the probabilities \(Q(f\vert i)\) and \(P(f\vert i)\) with

$$\begin{aligned} A_{s}= & {} \frac{\sum _{a}a\mathrm {Tr}[\hat{M}_{f}\hat{U}(t_{f}-t_{m}) \hat{\Pi }_{a}\hat{U}(t_{m}-t_{i})\hat{\rho }_{i} \hat{U}^{\dagger }(t_{m}-t_{i})\hat{\Pi }_{a}\hat{U}^{\dagger }(t_{f}-t_{m})]}{\sum _{a}\mathrm {Tr}[\hat{M}_{f}\hat{U}(t_{f}-t_{m}) \hat{\Pi }_{a}\hat{U}(t_{m}-t_{i})\hat{\rho }_{i} \hat{U}^{\dagger }(t_{m}-t_{i})\hat{\Pi }_{a}\hat{U}^{\dagger }(t_{f}-t_{m})]}, \end{aligned}$$
(27)
$$\begin{aligned} A_{w}= & {} \frac{\mathrm {Tr}[\hat{M}_{f}\hat{U}(t_{f}-t_{m}) \hat{A}\hat{U}(t_{m}-t_{i})\hat{\rho }_{i} \hat{U}^{\dagger }(t_{f}-t_{i})]}{\mathrm {Tr}[\hat{M}_{f} \hat{U}(t_{f}-t_{i})\hat{\rho }_{i} \hat{U}^{\dagger }(t_{f}-t_{i})]}, \end{aligned}$$
(28)
$$\begin{aligned} Q(f\vert i)= & {} \sum _{a}\mathrm {Tr}[\hat{M}_{f}\hat{U}(t_{f}-t_{m}) \hat{\Pi }_{a}\hat{U}(t_{m}-t_{i})\hat{\rho }_{i} \hat{U}^{\dagger }(t_{m}-t_{i})\hat{\Pi }_{a}\hat{U}^{\dagger }(t_{f}-t_{m})], \end{aligned}$$
(29)
$$\begin{aligned} P(f\vert i)= & {} \mathrm {Tr}[\hat{M}_{f} \hat{U}(t_{f}-t_{i})\hat{\rho }_{i} \hat{U}^{\dagger }(t_{f}-t_{i})]. \end{aligned}$$
(30)

Furthermore, if the quantum system is placed under the influence of a surrounding environment, the initial state \(\hat{\rho }_{i}\) should be replaced with an initial state of the system and environment, and the projector \(\hat{\Pi }_{a}\) and the measurement operator \(\hat{M}_{f}\) with \(\hat{\Pi }_{a}\otimes \hat{1}_{E}\) and \(\hat{M}_{f}\otimes \hat{1}_{E}\), where \(\hat{1}_{E}\) is an identity operator of the environment.

3 Measurement process for the average value \(A_{fi}\)

In this section, we provide a model of the quantum measurement which provides the conditional average \(A_{fi}\) given by Eq. (12). Our model is based on the consideration in Ref. [37]. For the sake of simplicity, we assume that \(\hat{\Pi }_{a}\) is a one-dimensional eigenprojector of \(\hat{A}\), that is, \(\hat{\Pi }_{a}=\vert a\rangle \langle a\vert \), where \(\vert a\rangle \) is an eigenstate of \(\hat{A}\) such that \(\langle a\vert a'\rangle =\delta _{aa'}\) and \(\sum _{a}\vert a\rangle \langle a\vert =\hat{1}\). In this case, the strong value is given by \(A_{s} =Q^{-1}(f\vert i)\sum _{a}a\langle a\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a\rangle \) with \(Q(f\vert i)=\sum _{a}\langle a\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a\rangle \). In order to obtain the conditional average \(A_{fi}\), we need to simultaneously derive both \(A_{s}\) and \(A_{w}\) from the measurement outcomes. Hence we suppose that a measurement apparatus consists of two parts: one is for the strong value \(A_{s}\) and the other is for the weak value \(A_{w}\). We refer to the former (the latter) as the \(s\)-part (\(w\)-part) of the apparatus. We denote the interaction Hamiltonian between the system and the \(s\)-part (\(w\)-part) of the apparatus as \(\hat{H}_{s}\) (\(\hat{H}_{w}\)). Then the total interaction Hamiltonian \(\hat{H}\) can be written in the following form:

$$\begin{aligned} \hat{H}=\hbar g_{s}\delta (t-t_{m})\hat{H}_{s} \otimes \vert s\rangle \langle s\vert +\hbar g_{w}\delta (t-t_{m})\hat{H}_{w} \otimes \vert w\rangle \langle w\vert , \end{aligned}$$
(31)

where \(t_{m}\) stands for a measurement time and \(\vert s\rangle \) and \(\vert w\rangle \) are orthonormal vectors that represent the \(s\)-part and \(w\)-part of the apparatus. The unitary operator that describes the state change caused by the interaction between the quantum system and the apparatus is given by

$$\begin{aligned} \hat{U}=\hat{U}_{s}\otimes \vert s\rangle \langle s\vert +\hat{U}_{w}\otimes \vert w\rangle \langle w\vert , \end{aligned}$$
(32)

with the \(s\)-part (\(w\)-part) unitary operator \(\hat{U}_{s,w}=\exp (-ig_{s,w}\hat{H}_{s,w})\). We assume that the unitary operators \(\hat{U}_{s}\) and \(\hat{U}_{w}\) are defined by

$$\begin{aligned} \hat{U}_{s}\vert a\rangle \otimes \vert \psi _{0}\rangle= & {} \vert a\rangle \otimes \vert \psi _{a}\rangle , \end{aligned}$$
(33)
$$\begin{aligned} \hat{U}_{w}\vert a\rangle \otimes \vert \phi _{0}\rangle= & {} \vert a\rangle \otimes \vert \phi _{a}\rangle , \end{aligned}$$
(34)

where \(\vert \psi _{0}\rangle \), \(\vert \psi _{a}\rangle \) and \(\vert \phi _{0}\rangle \), \(\vert \phi _{a}\rangle \) are normalized state vectors of the \(s\)-part and \(w\)-part of the apparatus. Furthermore we assume that the apparatus is initially prepared in the pure state,

$$\begin{aligned} \vert \Theta \rangle =\alpha \vert \psi _{0}\rangle \otimes \vert s\rangle +\beta \vert \phi _{0}\rangle \otimes \vert w\rangle , \end{aligned}$$
(35)

with \(\vert \alpha \vert ^{2}+\vert \beta \vert ^{2}=1\).

We derive the quantum state of the apparatus after the postselection of the system. First we expand the initial state of the system in terms of the eigenstates \(\vert a\rangle \) of \(\hat{A}\) as \(\hat{\rho }_{i}=\sum _{a,a'}\langle a\vert \hat{\rho }_{i}\vert a'\rangle \vert a\rangle \langle a'\vert \). Then we find that the interaction between the system and apparatus transforms the initial state \(\hat{\rho }_{i} \otimes \vert \Theta \rangle \langle \Theta \vert \) into

$$\begin{aligned} \hat{U}(\hat{\rho }_{i}\otimes \vert \Theta \rangle \langle \Theta \vert ) \hat{U}^{\dagger }= & {} \sum _{a,a'}\langle a\vert \hat{\rho }_{i}\vert a'\rangle (\alpha \vert a\rangle \otimes \vert \psi _{a}\rangle \otimes \vert s\rangle +\beta \vert a\rangle \otimes \vert \phi _{a}\rangle \otimes \vert w\rangle ) \nonumber \\&\times (\alpha ^{*}\langle a'\vert \otimes \langle \psi _{a'}\vert \otimes \langle s\vert +\beta ^{*}\langle a'\vert \otimes \langle \phi _{a'}\vert \otimes \langle w\vert ). \end{aligned}$$
(36)

Then after the postselection of the system, the quantum state of the apparatus becomes

$$\begin{aligned} \hat{\rho }_{A}^{fi}= & {} \frac{\mathrm {Tr}_{S}[(\hat{M}_{f}\otimes \hat{1}) \hat{U}(\hat{\rho }_{i}\otimes \vert \Theta \rangle \langle \Theta \vert )\hat{U}^{\dagger }]}{\mathrm {Tr}_{S}\mathrm {Tr}_{A} [(\hat{M}_{f}\otimes \hat{1}) \hat{U}(\hat{\rho }_{i}\otimes \vert \Theta \rangle \langle \Theta \vert )\hat{U}^{\dagger }]} \nonumber \\= & {} \frac{\sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \left( \alpha \vert s\rangle \otimes \vert \psi _{a}\rangle +\beta \vert w\rangle \otimes \vert \phi _{a}\rangle \right) \left( \alpha ^{*}\langle s\vert \otimes \langle \psi _{a'}\vert +\beta ^{*}\langle w\vert \otimes \langle \phi _{a'}\vert \right) }{\sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \left( \vert \alpha \vert ^{2}\langle \psi _{a'}\vert \psi _{a}\rangle +\vert \beta \vert ^{2}\langle \phi _{a'}\vert \phi _{a}\rangle \right) }, \end{aligned}$$
(37)

where \(\mathrm {Tr}_{S}\) and \(\mathrm {Tr}_{A}\) stand for trace operations over the Hilbert spaces of the system and the apparatus. The pointer observable of the apparatus is assumed to be

$$\begin{aligned} \hat{W}=\hat{X}\otimes \vert s\rangle \langle s\vert +\hat{Y}\otimes \vert w\rangle \langle w\vert , \end{aligned}$$
(38)

where \(\hat{X}\) and \(\hat{Y}\) are pointer observables of the \(s\)-part and \(w\)-part of the apparatus. Then we can obtain the average value of the pointer observable \(\hat{W}\),

$$\begin{aligned} \langle \hat{W}\rangle _{fi} =\mathrm {Tr}_{A}[\hat{W}\hat{\rho }_{A}^{fi}] =\frac{\sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle [\vert \alpha \vert ^{2}\langle \psi _{a'}\vert \hat{X}\vert \psi _{a}\rangle +\vert \beta \vert ^{2}\langle \phi _{a'}\vert \hat{Y}\vert \phi _{a}\rangle ]}{\sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \left[ \vert \alpha \vert ^{2}\langle \psi _{a'}\vert \psi _{a}\rangle +\vert \beta \vert ^{2}\langle \phi _{a'}\vert \phi _{a}\rangle \right] }. \end{aligned}$$
(39)

The success probability \(P_{\mathrm {suc}}(f\vert i)\) of the postselection is given by

$$\begin{aligned} P_{\mathrm {suc}}(f\vert i)=\sum _{a,a'} \langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \left[ \vert \alpha \vert ^{2}\langle \psi _{a'}\vert \psi _{a}\rangle +\vert \beta \vert ^{2}\langle \phi _{a'}\vert \phi _{a}\rangle \right] , \end{aligned}$$
(40)

in terms of which the average value of \(\langle \hat{W}\rangle _{fi}\) is calculated to be

$$\begin{aligned} \sum _{f}\langle \hat{W}\rangle _{fi}P_{\mathrm {suc}}(f\vert i) =\sum _{a}\langle a\vert \hat{\rho }_{i}\vert a\rangle \left[ \vert \alpha \vert ^{2}\langle \psi _{a}\vert \hat{X}\vert \psi _{a}\rangle +\vert \beta \vert ^{2}\langle \phi _{a}\vert \hat{Y}\vert \phi _{a}\rangle \right] . \end{aligned}$$
(41)

If we set \(\hat{X}=\hat{Y}\) and \(\vert \psi _{a}\rangle =\vert \phi _{a}\rangle \) in this equation, we obtain

$$\begin{aligned} \sum _{f}\langle \hat{W}\rangle _{fi}P_{\mathrm {suc}}(f\vert i) =\sum _{a}\langle a\vert \hat{\rho }_{i}\vert a\rangle f(a) =\langle f(\hat{A})\rangle , \end{aligned}$$
(42)

with \(f(a)=\langle \phi _{a}\vert \hat{X}\vert \phi _{a}\rangle \). This relation is equivalent to that discussed in Ref. [18], where the correspondence is given by \(\langle \hat{W}\rangle _{fi}\leftrightarrow \langle p\rangle _{\Omega _{1b}^{\succ }}\) and \(P_{\mathrm {suc}}(f\vert i)\leftrightarrow \fancyscript{P}_{1b}(\phi )\) with \(\langle p\rangle _{\Omega _{1b}^{\succ }}\) and \(\fancyscript{P}_{1b}(\phi )\) being defined in Ref. [18].

To show that the average value \(\langle \hat{W}\rangle _{fi}\) of the pointer observable is proportional to the conditional average \(A_{fi}\) of the system observable \(\hat{A}\), we assume that the apparatus state \(\vert \psi _{a}\rangle \) of the \(s\)-part is an eigenstate of the pointer observable \(\hat{X}\) such that \(\hat{X}\vert \psi _{a}\rangle =ga\vert \psi _{a}\rangle \) and \(\langle \psi _{a}\vert \psi _{a'}\rangle =\delta _{aa'}\) with some scaling parameter \(g\). Then we have

$$\begin{aligned}&\sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \langle \psi _{a'}\vert \psi _{a}\rangle =\sum _{a}\langle a\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a\rangle =Q(f\vert i), \end{aligned}$$
(43)
$$\begin{aligned}&\sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \langle \psi _{a'}\vert \hat{X}\vert \psi _{a}\rangle = g\sum _{a}a\langle a\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a\rangle =gA_{s}Q(f\vert i), \end{aligned}$$
(44)

with the strong value \(A_{s}\) of the system observable \(\hat{A}\). For the weak value \(A_{w}\), we assume that the apparatus states \(\vert \phi _{a}\rangle \) of the \(w\)-part are almost indistinguishable such that \(\langle \phi _{a}\vert \phi _{a'}\rangle \approx 1\) and the matrix element of the pointer observable \(\hat{Y}\) is given by \(\langle \phi _{a'}\vert \hat{Y}\vert \phi _{a}\rangle =g[(1+i\xi )a'+(1-i\xi )a]/2\) with a real parameter \(\xi \) [37]. Then we obtain

$$\begin{aligned} \sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \langle \phi _{a'}\vert \phi _{a}\rangle= & {} \sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle =P(f\vert i), \end{aligned}$$
(45)
$$\begin{aligned} \sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \langle \phi _{a'}\vert \hat{Y}\vert \phi _{a}\rangle= & {} g\sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \left( \frac{1+i\xi }{2}a'+\frac{1-i\xi }{2}a\right) \nonumber \\= & {} g\left( \frac{1+i\xi }{2}A_{w}^{*} +\frac{1-i\xi }{2}A_{w}\right) P(f\vert i) \nonumber \\= & {} g\mathrm {Re}[(1-i\xi )A_{w}]P(f\vert i). \end{aligned}$$
(46)

Substituting Eqs. (43)–(46) into Eq. (39), we find that the average value \(\langle \hat{W}\rangle _{fi}\) of the pointer observable is given by

$$\begin{aligned} \langle \hat{W}\rangle _{fi} =g\frac{\vert \alpha \vert ^{2}A_{s}Q(f\vert i) +\vert \beta \vert ^{2}\mathrm {Re}[(1-i\xi )A_{w}]P(f\vert i)}{\vert \alpha \vert ^{2}Q(f\vert i) +\vert \beta \vert ^{2}P(f\vert i)}. \end{aligned}$$
(47)

When we set \(s=\vert \alpha \vert ^{2}\) and \(w=\vert \beta \vert ^{2}(1-i\xi )\) in this equation, we obtain the equalities \(\langle \hat{W}\rangle _{fi}=gA_{fi}\) and \(s+\mathrm {Re}\,w=1\), where the latter follows from \(\vert \alpha \vert ^{2}+\vert \beta \vert ^{2}=1\). Therefore, we have shown that the conditional average \(A_{fi}\) given by Eq. (12) can be obtained from the measurement outcomes.
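The substitution above can be checked numerically. The following is a minimal sketch, assuming arbitrary illustrative inputs (the values of `alpha`, `beta`, `xi`, `g`, `A_s`, `A_w`, `Q` and `P` are hypothetical and not taken from the paper). The function `mixture` is simply the right-hand side of Eq. (47) rewritten with the weights \(s\) and \(w\); the function names are introduced here for illustration only.

```python
def pointer_average(alpha, beta, xi, g, A_s, A_w, Q, P):
    """Right-hand side of Eq. (47)."""
    num = abs(alpha) ** 2 * A_s * Q + abs(beta) ** 2 * ((1 - 1j * xi) * A_w).real * P
    den = abs(alpha) ** 2 * Q + abs(beta) ** 2 * P
    return g * num / den


def mixture(s, w, A_s, A_w, Q, P):
    """Same expression written with the weights s (real) and w (complex)."""
    return (s * A_s * Q + (w * A_w).real * P) / (s * Q + w.real * P)


# Illustrative inputs (hypothetical values, not taken from the paper).
alpha, beta, xi, g = 0.6, 0.8j, 0.5, 2.0
A_s, A_w, Q, P = 0.3, 1.5 - 0.7j, 0.4, 0.1

s = abs(alpha) ** 2
w = abs(beta) ** 2 * (1 - 1j * xi)

lhs = pointer_average(alpha, beta, xi, g, A_s, A_w, Q, P)
rhs = g * mixture(s, w, A_s, A_w, Q, P)
print(abs(lhs - rhs) < 1e-12, abs(s + w.real - 1) < 1e-12)
```

The identity is purely algebraic, so the check does not require the inputs to come from a physical pair of pre- and postselected states.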

As a simple example of the measurement process, we show that the conditional average of a qubit observable can be obtained by means of two controlled-NOT (CNOT) gates [5, 7]. First we suppose that a qubit observable to be measured is the Pauli matrix \(\hat{\sigma }_{z}=\vert 0\rangle \langle 0\vert -\vert 1\rangle \langle 1\vert \) with two orthonormal states \(\vert 0\rangle \) and \(\vert 1\rangle \) of the system. The strong value \(\Sigma _{s}\) and the weak value \(\Sigma _{w}\) of the Pauli matrix \(\hat{\sigma }_{z}\) are given, respectively, by

$$\begin{aligned} \Sigma _{s}= & {} \frac{\langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle -\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle }{\langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle +\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle }, \end{aligned}$$
(48)
$$\begin{aligned} \Sigma _{w}= & {} \frac{\langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle -\langle 0\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 0\rangle +\langle 1\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 1\rangle -\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle }{\langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle +\langle 0\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 0\rangle +\langle 1\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 1\rangle +\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle }. \end{aligned}$$
(49)

The real and imaginary parts of the weak value \(\Sigma _{w}\) are

$$\begin{aligned} \mathrm {Re}\Sigma _{w}= & {} \frac{\langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle -\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle }{\langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle +\langle 0\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 0\rangle +\langle 1\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 1\rangle +\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle }, \end{aligned}$$
(50)
$$\begin{aligned} \mathrm {Im}\Sigma _{w}= & {} i\frac{\langle 0\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 0\rangle -\langle 1\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 1\rangle }{\langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle +\langle 0\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 0\rangle +\langle 1\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 1\rangle +\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle }. \end{aligned}$$
(51)
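As a rough numerical illustration of Eqs. (48) and (49), the sketch below evaluates both values from the matrix elements of \(\hat{M}_{f}\) and \(\hat{\rho }_{i}\). It assumes pure pre- and postselected states with hypothetical amplitudes, so that the result can be compared with the pure-state weak value \(\langle \psi _{f}\vert \hat{\sigma }_{z}\vert \psi _{i}\rangle /\langle \psi _{f}\vert \psi _{i}\rangle \). The helper names are illustrative only.

```python
import cmath
import math


def projector(psi):
    """Return |psi><psi| as a 2x2 nested list."""
    return [[psi[r] * psi[c].conjugate() for c in range(2)] for r in range(2)]


def strong_and_weak_sigma_z(M_f, rho_i):
    """Evaluate Eqs. (48) and (49) from matrix elements in the {|0>, |1>} basis."""
    # p[a_prime][a] = <a'|M_f|a><a|rho_i|a'>
    p = [[M_f[ap][a] * rho_i[a][ap] for a in range(2)] for ap in range(2)]
    Q = (p[0][0] + p[1][1]).real
    P = (p[0][0] + p[0][1] + p[1][0] + p[1][1]).real
    sigma_s = (p[0][0] - p[1][1]).real / Q
    sigma_w = (p[0][0] - p[0][1] + p[1][0] - p[1][1]) / P
    return sigma_s, sigma_w, Q, P


# Hypothetical pure pre- and postselected states, chosen only for illustration.
psi_i = [complex(math.cos(0.3)), math.sin(0.3) * cmath.exp(0.7j)]
psi_f = [complex(math.cos(1.1)), math.sin(1.1) * cmath.exp(-0.4j)]

sigma_s, sigma_w, Q, P = strong_and_weak_sigma_z(projector(psi_f), projector(psi_i))

# Pure-state weak value <psi_f|sigma_z|psi_i> / <psi_f|psi_i> for comparison.
w0 = psi_f[0].conjugate() * psi_i[0]
w1 = psi_f[1].conjugate() * psi_i[1]
direct = (w0 - w1) / (w0 + w1)
print(sigma_s, sigma_w, abs(sigma_w - direct))
```

The strong value stays within the spectral range \([-1,1]\) of \(\hat{\sigma }_{z}\), whereas the weak value is in general complex.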

The unitary transformation \(\hat{U}_{\mathrm {CNOT}}\) of the CNOT gate is defined by \(\hat{U}_{\mathrm {CNOT}}\vert 0\rangle \otimes \vert \psi _{\gamma }\rangle =\vert 0\rangle \otimes \vert \psi _{\gamma }\rangle \) and \(\hat{U}_{\mathrm {CNOT}}\vert 1\rangle \otimes \vert \psi _{\gamma }\rangle =\vert 1\rangle \otimes \vert \psi _{\bar{\gamma }}\rangle \), where \(\vert \psi _{\gamma }\rangle =\gamma \vert 0\rangle +\bar{\gamma }\vert 1\rangle \) and \(\vert \psi _{\bar{\gamma }}\rangle =\bar{\gamma }\vert 0\rangle +\gamma \vert 1\rangle \) with \(\vert \gamma \vert ^{2}+\vert \bar{\gamma }\vert ^{2}=1\) are apparatus states. The left factor of the tensor product is the control bit and the right factor is the target bit.

To measure the strong value \(\Sigma _{s}\), we prepare the \(s\)-part of the apparatus in the initial state \(\vert \psi _{0}\rangle =\vert 0\rangle \) and apply the CNOT gate, where the control bit is the quantum system and the target bit is the apparatus. Here we have \(\hat{U}_{\mathrm {CNOT}}\vert a\rangle \otimes \vert 0\rangle =\vert a\rangle \otimes \vert a\rangle \) with \(a=0,1\), so that the apparatus states are \(\vert \psi _{a}\rangle =\vert a\rangle \). When the pointer observable of the \(s\)-part is given by \(\hat{X}=\hat{\sigma }_{z}\), we obtain the matrix element \(\langle \psi _{a'}\vert \hat{X}\vert \psi _{a}\rangle =(-1)^{a}\delta _{aa'}\). Thus we can derive the relations,

$$\begin{aligned} \sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \langle \psi _{a'}\vert \psi _{a}\rangle= & {} \langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle +\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle =Q(f\vert i), \end{aligned}$$
(52)
$$\begin{aligned} \sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \langle \psi _{a'}\vert \hat{X}\vert \psi _{a}\rangle= & {} \langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle -\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle =\Sigma _{s}Q(f\vert i). \end{aligned}$$
(53)

For the measurement of the weak value \(\Sigma _{w}\), we first prepare the \(w\)-part of the apparatus in the state \(\vert \phi _{0}\rangle =\sqrt{1/2+\epsilon }\vert 0\rangle +\sqrt{1/2-\epsilon }\vert 1\rangle \approx [(1+\epsilon )\vert 0\rangle +(1-\epsilon )\vert 1\rangle ]/\sqrt{2}\), where \(\epsilon \) is an infinitesimal parameter so that we can ignore all the terms higher than the first order of \(\epsilon \). Next we apply the CNOT gate, where the quantum system is a control bit and the apparatus is a target bit. Then, up to the first order, we obtain

$$\begin{aligned} \hat{U}_{\mathrm {CNOT}}\vert 0\rangle \otimes \vert \phi _{0}\rangle= & {} \vert 0\rangle \otimes \left( \frac{1+\epsilon }{\sqrt{2}}\vert 0\rangle +\frac{1-\epsilon }{\sqrt{2}}\vert 1\rangle \right) =\vert 0\rangle \otimes \vert \phi _{0}\rangle , \end{aligned}$$
(54)
$$\begin{aligned} \hat{U}_{\mathrm {CNOT}}\vert 1\rangle \otimes \vert \phi _{0}\rangle= & {} \vert 1\rangle \otimes \left( \frac{1-\epsilon }{\sqrt{2}}\vert 0\rangle +\frac{1+\epsilon }{\sqrt{2}}\vert 1\rangle \right) =\vert 1\rangle \otimes \vert \phi _{1}\rangle , \end{aligned}$$
(55)

where we note that \(\langle \phi _{0}\vert \phi _{1}\rangle =1+O(\epsilon ^{2})\). To obtain the weak value, we define the pointer observable of the \(w\)-part as \(\hat{Y}=(\hat{\sigma }_{z}-\xi \hat{\sigma }_{y})/2\epsilon \) with \(\hat{\sigma }_{y}=-i(\vert 0\rangle \langle 1\vert -\vert 1\rangle \langle 0\vert )\). Note that the pointer observable \(\hat{Y}\) can be expressed as \(\hat{Y}=Y_{0}(\hat{\sigma }_{z}\cos \theta -\hat{\sigma }_{y}\sin \theta )\) with \(Y_{0}=\sqrt{1+\xi ^{2}}/2\epsilon \) and \(\tan \theta =\xi \). Since we have the relations \(\langle \phi _{a'}\vert \hat{\sigma }_{z}\vert \phi _{a}\rangle =2\epsilon (-1)^{a}\delta _{aa'}\) and \(\langle \phi _{a'}\vert \hat{\sigma }_{y}\vert \phi _{a}\rangle =2i\epsilon (-1)^{a}(1-\delta _{aa'})\) up to the first order of \(\epsilon \), the matrix element is given by \(\langle \phi _{a'}\vert \hat{Y}\vert \phi _{a}\rangle =(-1)^{a}[\delta _{aa'}-i\xi (1-\delta _{aa'})]\). Using Eqs. (50) and (51), we can derive
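The matrix elements quoted above can be verified with a short numerical sketch. It assumes the unexpanded states \(\sqrt{1/2\pm \epsilon }\) for \(\vert \phi _{0}\rangle \) and \(\vert \phi _{1}\rangle \) and hypothetical values of \(\xi \) and \(\epsilon \); the function names are illustrative only.

```python
import math


def phi(a, eps):
    """Apparatus state |phi_a> of the w-part after the CNOT gate (real amplitudes)."""
    big, small = math.sqrt(0.5 + eps), math.sqrt(0.5 - eps)
    return [big, small] if a == 0 else [small, big]


def pointer_Y(xi, eps):
    """Y = (sigma_z - xi * sigma_y) / (2 * eps) with sigma_y = [[0, -i], [i, 0]]."""
    return [[1 / (2 * eps), 1j * xi / (2 * eps)],
            [-1j * xi / (2 * eps), -1 / (2 * eps)]]


def element(bra, op, ket):
    """Matrix element <bra|op|ket> for real-amplitude states."""
    return sum(bra[r] * op[r][c] * ket[c] for r in range(2) for c in range(2))


xi, eps = 0.5, 1e-3  # hypothetical values for illustration
Y = pointer_Y(xi, eps)

max_dev = 0.0
for a in (0, 1):
    for ap in (0, 1):
        delta = 1 if a == ap else 0
        expected = (-1) ** a * (delta - 1j * xi * (1 - delta))
        got = element(phi(ap, eps), Y, phi(a, eps))
        max_dev = max(max_dev, abs(got - expected))

# Overlap <phi_0|phi_1>, which deviates from 1 only at second order in eps.
overlap = sum(x * y for x, y in zip(phi(0, eps), phi(1, eps)))
print(max_dev, 1 - overlap)
```

Here `max_dev` measures the largest deviation from \((-1)^{a}[\delta _{aa'}-i\xi (1-\delta _{aa'})]\), and `1 - overlap` shows the second-order deviation of \(\langle \phi _{0}\vert \phi _{1}\rangle \) from unity.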

$$\begin{aligned} \sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \langle \phi _{a'}\vert \phi _{a}\rangle= & {} \sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle =P(f\vert i), \end{aligned}$$
(56)
$$\begin{aligned} \sum _{a,a'}\langle a'\vert \hat{M}_{f}\vert a\rangle \langle a\vert \hat{\rho }_{i}\vert a'\rangle \langle \phi _{a'}\vert \hat{Y}\vert \phi _{a}\rangle= & {} \langle 0\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 0\rangle -\langle 1\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 1\rangle \nonumber \\&-i\xi [\langle 1\vert \hat{M}_{f}\vert 0\rangle \langle 0\vert \hat{\rho }_{i}\vert 1\rangle -\langle 0\vert \hat{M}_{f}\vert 1\rangle \langle 1\vert \hat{\rho }_{i}\vert 0\rangle ] \nonumber \\= & {} \left[ \mathrm {Re}\Sigma _{w}+\xi \mathrm {Im}\Sigma _{w}\right] P(f\vert i) =\mathrm {Re}[(1-i\xi )\Sigma _{w}]P(f\vert i). \end{aligned}$$
(57)

For the pointer observable \(\hat{W}=\hat{\sigma }_{z}\otimes \vert s\rangle \langle s\vert +Y_{0}(\hat{\sigma }_{z}\cos \theta -\hat{\sigma }_{y}\sin \theta )\otimes \vert w\rangle \langle w\vert \) of the apparatus, the average value is given by

$$\begin{aligned} \langle \hat{W}\rangle _{fi}=g\frac{s\Sigma _{s}Q(f\vert i) +\mathrm {Re}(w\Sigma _{w})P(f\vert i)}{sQ(f\vert i)+(\mathrm {Re}\,w)P(f\vert i)} =gA_{fi} \quad (\hat{A}=\hat{\sigma }_{z}). \end{aligned}$$
(58)

Therefore, we have shown how to obtain the conditional average of the qubit observable \(\hat{\sigma }_{z}\).

The measurement process that we have considered above is implemented by the system shown in Fig. 1. In this process, we first prepare an apparatus qubit on a path \(\vert s\rangle \), the internal state of which is given by \(\vert 0\rangle \). The initial state of the apparatus is \(\vert \Theta \rangle =\vert 0\rangle \otimes \vert s\rangle \). Next we send the apparatus qubit into a lossless beam splitter, the output state of which is given by \(\vert \Theta '\rangle =\vert 0\rangle \otimes (\alpha \vert s\rangle +\beta \vert w\rangle )\), where \(\vert \alpha \vert ^{2}\) and \(\vert \beta \vert ^{2}\) are the reflectance and the transmittance of the beam splitter [7]. After that, we apply the unitary transformation \(\hat{V}_{\epsilon }\) to the apparatus qubit on the path \(\vert w\rangle \),

$$\begin{aligned} \hat{V}_{\epsilon }=\sqrt{\frac{1}{2}+\epsilon }\vert 0\rangle \langle 0\vert +\sqrt{\frac{1}{2}-\epsilon }\vert 1\rangle \langle 0\vert +\sqrt{\frac{1}{2}-\epsilon }\vert 0\rangle \langle 1\vert -\sqrt{\frac{1}{2}+\epsilon }\vert 1\rangle \langle 1\vert \quad (\epsilon \ll 1), \end{aligned}$$
(59)

which is nearly equal to \(\hat{H}+\epsilon (\hat{\sigma }_{z} -\hat{\sigma }_{x})/\sqrt{2}\) with the Hadamard transformation \(\hat{H}\) [5, 7]. Then, up to the first order with respect to the infinitesimal parameter \(\epsilon \), the state of the apparatus becomes

$$\begin{aligned} \vert \Theta ''\rangle =\alpha \vert 0\rangle \otimes \vert s\rangle +\beta \left( \frac{1+\epsilon }{\sqrt{2}}\vert 0\rangle +\frac{1-\epsilon }{\sqrt{2}}\vert 1\rangle \right) \otimes \vert w\rangle . \end{aligned}$$
(60)
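The properties of \(\hat{V}_{\epsilon }\) used here can be checked numerically. The sketch below assumes a hypothetical value of \(\epsilon \) and verifies that the matrix of Eq. (59) is unitary, that it agrees with \(\hat{H}+\epsilon (\hat{\sigma }_{z}-\hat{\sigma }_{x})/\sqrt{2}\) up to second-order terms, and that it maps \(\vert 0\rangle \) to the state \(\vert \phi _{0}\rangle \). The function names are illustrative only.

```python
import math


def v_eps(eps):
    """Matrix of Eq. (59) in the {|0>, |1>} basis."""
    big, small = math.sqrt(0.5 + eps), math.sqrt(0.5 - eps)
    return [[big, small], [small, -big]]


def first_order(eps):
    """H + eps * (sigma_z - sigma_x) / sqrt(2)."""
    r = 1 / math.sqrt(2)
    return [[r * (1 + eps), r * (1 - eps)], [r * (1 - eps), -r * (1 + eps)]]


eps = 1e-3  # hypothetical small parameter
V = v_eps(eps)

# V is real and symmetric, so V^dagger V = V V.
VV = [[sum(V[r][k] * V[k][c] for k in range(2)) for c in range(2)] for r in range(2)]
unitarity_error = max(abs(VV[r][c] - (1 if r == c else 0))
                      for r in range(2) for c in range(2))

approx = first_order(eps)
approx_error = max(abs(V[r][c] - approx[r][c]) for r in range(2) for c in range(2))

# Acting on |0> selects the first column, i.e. the state |phi_0>.
phi_0 = [V[0][0], V[1][0]]
print(unitarity_error, approx_error, phi_0)
```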

We then apply the two CNOT gates, one on each of the paths \(\vert s\rangle \) and \(\vert w\rangle \), with the system qubit as the control bit and the apparatus qubit as the target bit. Finally, we perform the postselection \(\hat{M}_{f}\) of the system qubit and observe the pointer observables \(\hat{X}=\hat{\sigma }_{z}\) on the path \(\vert s\rangle \) and \(\hat{Y}=(\hat{\sigma }_{z}-\xi \hat{\sigma }_{y})/2\epsilon \) on the path \(\vert w\rangle \) of the apparatus. The measurement outcomes yield the conditional average \(A_{fi}\) of the system observable \(\hat{A}=\hat{\sigma }_{z}\).
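The whole process can be mimicked by a small state-vector calculation. The following is a sketch under stated assumptions: the system is prepared in a pure state \(\vert \psi _{i}\rangle \) and postselected onto a pure state \(\vert \psi _{f}\rangle \), the amplitudes and the values of \(\alpha \), \(\beta \), \(\xi \) and \(\epsilon \) are hypothetical, and the function names are introduced here for illustration. The simulated pointer average is compared with Eq. (47) for \(g=1\) and \(\hat{A}=\hat{\sigma }_{z}\).

```python
import cmath
import math


def simulate_pointer_average(psi_i, psi_f, alpha, beta, xi, eps):
    """Postselected average of W for the two-path process with A = sigma_z."""
    big, small = math.sqrt(0.5 + eps), math.sqrt(0.5 - eps)
    # Apparatus states after the CNOT gates, indexed by the system value a.
    psi = {0: [1.0, 0.0], 1: [0.0, 1.0]}      # s-path: |psi_a> = |a>
    phi = {0: [big, small], 1: [small, big]}  # w-path: |phi_a>
    # Project the system onto |psi_f>: unnormalized apparatus vector on each path.
    chi_s = [sum(psi_f[a].conjugate() * psi_i[a] * alpha * psi[a][b] for a in (0, 1))
             for b in (0, 1)]
    chi_w = [sum(psi_f[a].conjugate() * psi_i[a] * beta * phi[a][b] for a in (0, 1))
             for b in (0, 1)]
    X = [[1, 0], [0, -1]]
    Y = [[1 / (2 * eps), 1j * xi / (2 * eps)],
         [-1j * xi / (2 * eps), -1 / (2 * eps)]]

    def expect(vec, op):
        return sum(vec[r].conjugate() * op[r][c] * vec[c]
                   for r in (0, 1) for c in (0, 1))

    norm = sum(abs(x) ** 2 for x in chi_s + chi_w)
    return ((expect(chi_s, X) + expect(chi_w, Y)) / norm).real


def eq47_sigma_z(psi_i, psi_f, alpha, beta, xi):
    """Right-hand side of Eq. (47) with g = 1 for pure states."""
    w0 = psi_f[0].conjugate() * psi_i[0]
    w1 = psi_f[1].conjugate() * psi_i[1]
    Q = abs(w0) ** 2 + abs(w1) ** 2
    P = abs(w0 + w1) ** 2
    sigma_s = (abs(w0) ** 2 - abs(w1) ** 2) / Q
    sigma_w = (w0 - w1) / (w0 + w1)
    s, w = abs(alpha) ** 2, abs(beta) ** 2 * (1 - 1j * xi)
    return (s * sigma_s * Q + (w * sigma_w).real * P) / (s * Q + w.real * P)


# Hypothetical parameters, chosen only for illustration.
psi_i = [complex(math.cos(0.3)), math.sin(0.3) * cmath.exp(0.7j)]
psi_f = [complex(math.cos(1.1)), math.sin(1.1) * cmath.exp(-0.4j)]
alpha, beta, xi, eps = 0.6, 0.8, 0.5, 1e-4

simulated = simulate_pointer_average(psi_i, psi_f, alpha, beta, xi, eps)
predicted = eq47_sigma_z(psi_i, psi_f, alpha, beta, xi)
print(simulated, predicted)
```

The two numbers differ only through the second-order deviation of \(\langle \phi _{0}\vert \phi _{1}\rangle \) from unity, so the agreement improves as \(\epsilon \) is decreased.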

Fig. 1 A schematic representation of the measurement process for obtaining the conditional average \(A_{fi}\) of a qubit observable, where \(\hat{A}=\hat{\sigma }_{z}\), \(\hat{X}=\hat{\sigma }_{z}\) and \(\hat{Y}=Y_{0}(\hat{\sigma }_{z}\cos \theta -\hat{\sigma }_{y}\sin \theta )\).

Although we have considered the conditional average of the qubit observable \(\hat{\sigma }_{z}\), it is easy to generalize the measurement process to a general qubit observable \(\hat{\sigma }_{n}=\sum _{j=x,y,z}n_{j}\hat{\sigma }_{j}\) with \(\sum _{j=x,y,z}n_{j}^{2}=1\). To obtain the conditional average of \(\hat{\sigma }_{n}\), all that we have to do is to replace the states \(\vert 0\rangle \) and \(\vert 1\rangle \) with the eigenstates \(\vert 0_{n}\rangle \) and \(\vert 1_{n}\rangle \) of the observable \(\hat{\sigma }_{n}\). For instance, we use \(\hat{\sigma }_{z}^{n} =\vert 0_{n}\rangle \langle 0_{n}\vert -\vert 1_{n}\rangle \langle 1_{n}\vert \) and \(\hat{\sigma }_{y}^{n} =-i(\vert 0_{n}\rangle \langle 1_{n}\vert -\vert 1_{n}\rangle \langle 0_{n}\vert )\) instead of \(\hat{\sigma }_{z}\) and \(\hat{\sigma }_{y}\), and the CNOT gate \(\hat{U}_{\mathrm {CNOT}}\) is redefined by \(\hat{U}_{\mathrm {CNOT}}\vert 0_{n}\rangle \otimes \vert 0_{n}\rangle =\vert 0_{n}\rangle \otimes \vert 0_{n}\rangle \), \(\hat{U}_{\mathrm {CNOT}}\vert 0_{n}\rangle \otimes \vert 1_{n}\rangle =\vert 0_{n}\rangle \otimes \vert 1_{n}\rangle \), \(\hat{U}_{\mathrm {CNOT}}\vert 1_{n}\rangle \otimes \vert 0_{n}\rangle =\vert 1_{n}\rangle \otimes \vert 1_{n}\rangle \) and \(\hat{U}_{\mathrm {CNOT}}\vert 1_{n}\rangle \otimes \vert 1_{n}\rangle =\vert 1_{n}\rangle \otimes \vert 0_{n}\rangle \). Then the measurement system shown in Fig. 1, with \(\vert 0\rangle \) and \(\vert 1\rangle \) replaced by \(\vert 0_{n}\rangle \) and \(\vert 1_{n}\rangle \), provides the conditional average of \(\hat{\sigma }_{n}\).
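The replacement of the basis can be illustrated with a short sketch. It assumes the standard polar parametrization \(n=(\sin \theta \cos \varphi ,\sin \theta \sin \varphi ,\cos \theta )\) and one particular phase convention for the eigenstates, neither of which is fixed by the text; the angles and function names are hypothetical. The check confirms that \(\hat{\sigma }_{z}^{n}\) built from the eigenstates coincides with \(\hat{\sigma }_{n}\).

```python
import cmath
import math


def eigenstates(theta, phi):
    """Eigenstates |0_n>, |1_n> of sigma_n for n = (sin t cos p, sin t sin p, cos t)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    zero_n = [complex(c), s * cmath.exp(1j * phi)]
    one_n = [-s * cmath.exp(-1j * phi), complex(c)]
    return zero_n, one_n


def outer(ket, bra):
    """|ket><bra| as a 2x2 nested list."""
    return [[ket[r] * bra[c].conjugate() for c in range(2)] for r in range(2)]


theta, phi = 1.2, 0.4  # hypothetical direction of n
zero_n, one_n = eigenstates(theta, phi)
nx = math.sin(theta) * math.cos(phi)
ny = math.sin(theta) * math.sin(phi)
nz = math.cos(theta)

# sigma_n = n_x sigma_x + n_y sigma_y + n_z sigma_z
sigma_n = [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

# sigma_z^n = |0_n><0_n| - |1_n><1_n| and sigma_y^n = -i(|0_n><1_n| - |1_n><0_n|)
p0, p1 = outer(zero_n, zero_n), outer(one_n, one_n)
sigma_z_n = [[p0[r][c] - p1[r][c] for c in range(2)] for r in range(2)]
a01, a10 = outer(zero_n, one_n), outer(one_n, zero_n)
sigma_y_n = [[-1j * (a01[r][c] - a10[r][c]) for c in range(2)] for r in range(2)]

deviation = max(abs(sigma_z_n[r][c] - sigma_n[r][c])
                for r in range(2) for c in range(2))
print(deviation)
```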

4 Summary

In this paper, we have considered the conditional average of an observable which is measured between the initial preparation and the postselection of the system. Assuming that the linear map describing the effect of the measurement on the system can be expressed only in terms of eigenprojectors of a measured observable, we have found that the conditional average is represented by the mixture of the strong value and the weak value of the measured observable. The joint quasi-probability of the measurement outcome and the postselection is not positive due to the anomaly of the weak value. Furthermore, we have shown the measurement process which derives the conditional average from the measurement outcomes. The apparatus consists of two parts: one is for the strong value and the other is for the weak value. In particular, when the measured system is a qubit, the measurement process can be implemented by means of a lossless beam splitter, a single-qubit unitary transformation and two controlled NOT gates, where the pointer observables are the Pauli operators.