CTRW modeling of quantum measurement and fractional equations of quantum stochastic filtering and control

Initially developed in the framework of quantum stochastic calculus, the main equations of quantum stochastic filtering were later derived as the limits of Markov models of discrete measurements under appropriate scaling. In many branches of modern physics it has become popular to extend random walk modeling to continuous time random walk (CTRW) modeling, where the time between discrete events is taken to be non-exponential. In the present paper we apply CTRW modeling to continuous quantum measurements, yielding new fractional in time evolution equations of quantum filtering and thus new fractional equations for the quantum mechanics of open systems. The related quantum control problems and games turn out to be described by fractional Hamilton-Jacobi-Bellman (HJB) equations on Riemannian manifolds. In passing we provide a full derivation of the standard quantum filtering equations, in a way modified as compared with existing texts, which (i) provides explicit rates of convergence (not available via the tightness-of-martingales approach developed previously) and (ii) allows for the direct application of the basic results of CTRWs to deduce the final fractional filtering equations.


Introduction
Direct continuous observations are known to destroy quantum evolutions (the so-called quantum Zeno paradox), so that continuous quantum measurements have to be indirect, and the results of the observation are assessed via quantum filtering. Initially developed in the framework of quantum stochastic calculus by Belavkin in the 1980s in [6][7][8], see [12] for a readable modern account, the main equations of quantum stochastic filtering, often referred to as the Belavkin equations, were later derived via a more elementary approach, as the limit of standard discrete measurements under appropriate scaling, see e.g. [9,10,39]. The scaling arises from the basic Markovian assumption that the times between measurements are either fixed or exponentially distributed, as in a standard random walk. Since such a Markovian assumption has no a priori justification, in many branches of modern physics it has become popular to extend random walk modeling to continuous time random walk (CTRW) modeling, where the time between discrete events is taken to be non-exponential, usually from the domain of attraction of a stable law. In the present paper we apply CTRW modeling to continuous quantum measurements, yielding new fractional in time evolution equations of quantum filtering in the scaling limit. The related quantum control problems turn out to be described by fractional Hamilton-Jacobi-Bellman (HJB) equations on Riemannian manifolds (complex projective spaces in the case of finite-dimensional quantum mechanics), or by the fractional Isaacs equation in the case of competitive control.
In passing we provide a full derivation of the standard quantum filtering equations (explaining from scratch all the underlying quantum mechanical rules used) in a slightly modified and simplified way, yielding also new explicit rates of convergence (which are not available via the tightness-of-martingales approach developed previously), and tailored in a way that allows for the direct application of the basic results of CTRWs to deduce the final fractional filtering equations.
Several general comments on a wider context are in order.
(i) The fractional equations of quantum stochastic filtering derived here can be considered as an alternative formulation of fractional quantum mechanics, different from the framework of fractional Schrödinger equations suggested in [31] and extensively studied recently. This also leads to a different class of quantum control problems than those related to the fractional Schrödinger formulation, as discussed e.g. in [45]. (ii) Fractional versions of classical stochastic filtering (see [2] for the basics) have been actively studied recently, see e.g. [44]. (iii) The quantum mean-field games developed by the author in [25] can now be extended to a theory of fractional quantum mean-field games. Classical versions of fractional mean-field games have just started to appear in the literature, see [13]. On the other hand, applications of classical stochastic filtering to the study of mean-field games have also started to appear, see [42]. (iv) Fractional modeling and CTRW have become very popular in almost all domains of physics, as well as in economics and finance, see e.g. [3,36,43,46] for some representative references.
The contents of the paper are as follows. In Section 2 we recall the basic notions and notations of finite-dimensional quantum mechanics, and in Section 3 we introduce the Markov chain of sequential indirect quantum measurements, which is the standard starting point for dealing with continuous measurements. In Sections 4 and 5 we derive the main quantum filtering equations in the cases of so-called counting and diffusive observations. As was already mentioned, though the derivation of the filtering equations from the approximating Markov chain is by now well known (see e.g. [38]), our approach is new and yields explicit rates of convergence. In Section 6 the limiting equation is derived in the general case of mixed counting and diffusive observations via a multichannel measuring device. This preparatory work allows us to derive our main results, the fractional equations of quantum filtering and control, in a more or less straightforward way, by applying the established techniques of CTRW to the setting of the Markov chains of sequential quantum measurements developed in Sections 4-6. This is done in Sections 7 and 8. In Section 9 we briefly describe a slightly different Markov chain approximation to continuous measurement that can be used to derive filtering equations in certain cases involving unbounded operators. In Appendices A, B, C several (known) probabilistic techniques are presented in a concise form tailored to our purposes. They are used in the main body of the paper.
Some basic notations to be used throughout the text are as follows.
For two Banach spaces B and D equipped with norms ‖·‖_B and ‖·‖_D respectively, let us denote by L(D, B) the Banach space of bounded linear operators D → B equipped with the usual operator norm ‖·‖_{D→B}. We shall also write L(B) for L(B, B).
The scalar product of operators in a Hilbert space is given by the trace: (R, S) = tr(RS).
For K = R^d or a convex closed subset of R^d we denote by C(K) the Banach space of continuous bounded functions on K, equipped with the sup-norm, and by C^k(K) the Banach space of k times continuously differentiable functions on K (with the derivatives at the boundary understood as the continuous extensions of the derivatives at the inner points), with the norm being the sum of the sup-norms of the function and all its partial derivatives of order not exceeding k.

Notations for quantum states and tensor products
Recall that a general isolated quantum system is described by a Hilbert space H and a self-adjoint operator H in it, the Hamiltonian. The pure states of the system are unit vectors in H, and the general mixed states are density matrices, that is, nonnegative operators in H with unit trace. Let us denote by S(H) the set of all such mixed states in H. To a pure state there corresponds a density matrix according to the rule ψ → γ = ψ ⊗ ψ, also denoted in Dirac's notation as |ψ⟩⟨ψ|. This density matrix is the one-dimensional orthogonal projector onto the line generated by ψ.
Or equivalently, if X ∈ H_0 ⊗ H_1 has coordinates X_{kj} in the basis {e_k ⊗ f_j}, the vector AX has the coordinates (AX)_{kj} = Σ_{m,l} A_{kj,ml} X_{ml} in this basis. A product A ⊗ B of two operators A and B acting in H_0 and H_1 respectively is defined by its action on tensor products as (A ⊗ B)(x ⊗ y) = Ax ⊗ By. In the coordinate description, A ⊗ B has the matrix elements (A ⊗ B)_{i_1 i_2, j_1 j_2} = A_{i_1 j_1} B_{i_2 j_2} expressed in terms of the matrix elements of A and B.
An operator A in H_0 has the natural lifting A ⊗ I (where I is the unit operator) to H_0 ⊗ H_1. The key notion of the theory of interacting systems is that of the partial trace. For an operator A in H_0 ⊗ H_1 the partial trace with respect to the second system is the operator tr_{p1} A in H_0 given by the matrix (tr_{p1} A)_{km} = Σ_j A_{kj,mj}. This partial trace is interpreted as the state of the first system given the state of the coupled one. Therefore it can be looked at as the quantum analog of the notion of marginal distribution in classical probability. Similarly, the partial trace with respect to the first system is the operator tr_{p0} A in H_1 given by the matrix (tr_{p0} A)_{jl} = Σ_k A_{kj,kl}. Clearly, tr(tr_{p0} A) = tr(tr_{p1} A) = tr(A).
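The partial traces are easy to experiment with numerically. Below is a minimal sketch (the function name and its einsum-based implementation are ours, not the paper's), verifying the identity tr(tr_p0 A) = tr(tr_p1 A) = tr(A) stated above:

```python
import numpy as np

# Minimal sketch of the partial traces defined above: A acts on C^{n0} (x) C^{n1}.
def partial_trace(A, n0, n1, keep=0):
    """Trace out one tensor factor of an operator A on C^{n0} (x) C^{n1}."""
    A4 = A.reshape(n0, n1, n0, n1)               # indices (k, j; m, l)
    if keep == 0:
        return np.einsum('kjmj->km', A4)         # tr_{p1}: sum over the second factor
    return np.einsum('kjkl->jl', A4)             # tr_{p0}: sum over the first factor

rng = np.random.default_rng(0)
n0, n1 = 3, 2
A = rng.normal(size=(n0 * n1, n0 * n1)) + 1j * rng.normal(size=(n0 * n1, n0 * n1))

# The identity tr(tr_{p0} A) = tr(tr_{p1} A) = tr(A) stated in the text:
assert np.allclose(np.trace(partial_trace(A, n0, n1, 0)), np.trace(A))
assert np.allclose(np.trace(partial_trace(A, n0, n1, 1)), np.trace(A))
```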
In the two-dimensional Hilbert space C^2 one usually chooses the standard basis e_0 = (1, 0), e_1 = (0, 1), and represents the Hilbert product space H_0 ⊗ C^2 by the natural decomposition H_0 ⊗ C^2 = (H_0 ⊗ e_0) ⊕ (H_0 ⊗ e_1) = H_{00} ⊕ H_{01}.

V. Kolokoltsov
Every operator A in this space has the block decomposition A = (A_{i→j})_{i,j=0,1}, where the operators A_{i→j} act from H_{0i} to H_{0j}, i, j = 0, 1. The trace (2.1) then gets a corresponding block expression, and we shall use the resulting block representations of the operators involved. To conclude this section, let us write down the simple small time asymptotic formula for the evolutions e^{−itH} that we shall use repeatedly. Namely, up to terms of order higher than t^2 in small t, we have e^{−itH} = I − itH − (t^2/2) H^2 + O(t^3).
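The small-time expansion just stated is easy to check numerically; the following sketch (with a randomly generated Hermitian H of our choosing, but any Hermitian matrix would do) confirms that the remainder is of order t^3:

```python
import numpy as np

# Numerical check of the expansion e^{-itH} = I - itH - (t^2/2)H^2 + O(t^3).
rng = np.random.default_rng(1)
n = 4
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (B + B.conj().T) / 2                       # a Hermitian "Hamiltonian"
lam, V = np.linalg.eigh(H)

def U(t):
    """Exact e^{-itH} via the spectral decomposition of H."""
    return (V * np.exp(-1j * t * lam)) @ V.conj().T

nrm = np.linalg.norm(H, 2)                     # spectral norm of H
errs = []
for t in (0.1, 0.05, 0.025):
    approx = np.eye(n) - 1j * t * H - (t**2 / 2) * (H @ H)
    err = np.linalg.norm(U(t) - approx, 2)
    assert err <= (t * nrm) ** 3               # remainder is O(t^3)
    errs.append(err)
assert errs[0] > errs[1] > errs[2]
```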

The starting point: Markov chains of sequential indirect observations
Here we describe the Markov chains of sequential indirect observations (rather standard by now, at least after paper [1]) in discrete and continuous time, first quickly recalling the main notions related to quantum measurements. Physical observables are given by self-adjoint operators A in H. If A has a discrete spectrum (which is always the case in the finite-dimensional H that we shall mostly work with), then A has the spectral decomposition A = Σ_j λ_j P_j, where the P_j are orthogonal projections on the eigenspaces of A corresponding to the eigenvalues λ_j. According to the basic postulate of quantum measurement, measuring the observable A in a state γ (often referred to as the Stern-Gerlach experiment) can yield each of the eigenvalues λ_j with the probability tr(γ P_j) = tr(P_j γ P_j), (3.1) and, if the value λ_j was obtained, the state of the system changes (instantaneously) to the reduced state P_j γ P_j / tr(γ P_j). In particular, if the state γ was pure, γ = |ψ⟩⟨ψ|, then the probability to get λ_j as the result of the measurement becomes (ψ, P_j ψ), and the reduced state also remains pure and is given by the (normalized) vector P_j ψ. If the interaction with the apparatus was performed 'without reading the results', the state γ is said to be subject to a non-selective measurement, which changes γ to the state Σ_j P_j γ P_j. Indirect measurements of a chosen quantum system in the initial space H_0, which we shall often refer to as an atom, are organised in the following way. One couples the atom with another quantum system, a measuring device, specified by another Hilbert space H. Namely, the combined system lives in the tensor product Hilbert space H_0 ⊗ H and its evolution is given by a certain self-adjoint operator H in H_0 ⊗ H. In the measuring device some fixed vector ϕ ∈ H is chosen, called the vacuum and interpreted as the stationary state of the device when no interaction is involved. The corresponding density matrix will be denoted Ω = |ϕ⟩⟨ϕ|.
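The measurement postulate recalled above can be sketched in a few lines; the qubit example and all names below are illustrative choices of ours:

```python
import numpy as np

# Measuring A = sum_j lambda_j P_j in a state gamma yields outcome j with
# probability tr(gamma P_j); the state is reduced to P_j gamma P_j / tr(gamma P_j).
def measure(gamma, projectors, rng):
    probs = np.array([np.trace(gamma @ P).real for P in projectors])
    j = rng.choice(len(projectors), p=probs / probs.sum())
    P = projectors[j]
    return j, (P @ gamma @ P) / np.trace(gamma @ P).real

psi = np.array([1.0, 1.0]) / np.sqrt(2)               # the pure state (|0>+|1>)/sqrt(2)
gamma = np.outer(psi, psi.conj())
P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])     # spectral projectors of sigma_z
rng = np.random.default_rng(2)
j, gamma_new = measure(gamma, [P0, P1], rng)

assert np.allclose(np.trace(gamma_new), 1.0)
assert np.allclose(gamma_new @ gamma_new, gamma_new)  # the reduced state remains pure
```

Both outcomes here have probability 1/2, and the post-measurement state is again a one-dimensional projector, in line with the postulate.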
Indirect measurements of the states of the atom are performed by measuring the coupled system via an observable of the second system and then projecting the resulting state to the atom via the partial trace.
Namely, it is described by an operator R in H with the spectral decomposition R = Σ_j λ_j P_j and is performed in two steps: given a state γ in H_0 ⊗ H, one performs a measurement of R lifted as I ⊗ R to H_0 ⊗ H, yielding the values λ_j and the new states (I ⊗ P_j)γ(I ⊗ P_j)/tr(γ(I ⊗ P_j)) with probabilities p_j = tr(γ(I ⊗ P_j)), and then one projects these states to H_0 via the partial trace, producing the states tr_{p1}[(I ⊗ P_j)γ(I ⊗ P_j)/tr(γ(I ⊗ P_j))]. (3.2) The discrete-time Markov chain of sequential measurements starts from a state ρ of the atom, couples it with the device in the vacuum state, evolves the coupled system for a time t and performs the measurement just described, producing the new states (3.2) with the probabilities p_j. Then the same repeats, starting with ρ_t as the initial state. Let us denote by U_t the transition operator of this Markov chain, which acts on the set C(S(H_0)) of continuous functions on S(H_0). The continuous-time version of this chain evolves according to the same rules, with the only difference that the times t between successive measurements are not fixed, but are exponential random variables τ with some fixed intensity λ: P(τ > t) = e^{−λt}. The generator L_λ of this Markov process is bounded in C(S(H_0)). All the 'quantum content' of the theory is now captured in the explicit formula (3.3). What follows will be a purely classical probabilistic analysis of these Markov chains, their scaling limits and control. In this paper we shall work with measuring devices of the simplest form, living in the two-dimensional Hilbert space C^2 or, more generally, in tensor products of such spaces. Choosing the standard basis e_0 = (1, 0), e_1 = (0, 1), we shall use the corresponding decomposition, and we shall choose the vacuum vector ϕ = e_0.
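One full step of this Markov chain (coupling with the vacuum, joint evolution by e^{-itH}, measurement of the lifted projectors I ⊗ P_j, projection back by the partial trace) can be sketched as follows; the dimensions, the Hamiltonian and the projectors are toy choices of ours:

```python
import numpy as np

# One step of the chain of sequential indirect observations described above.
def step(rho, H, t, P_list, rng):
    n = rho.shape[0]
    Omega = np.zeros((2, 2), dtype=complex); Omega[0, 0] = 1.0   # vacuum |e0><e0|
    gamma = np.kron(rho, Omega)                                  # couple atom and device
    lam, V = np.linalg.eigh(H)
    U = (V * np.exp(-1j * t * lam)) @ V.conj().T                 # e^{-itH}
    gamma = U @ gamma @ U.conj().T                               # joint evolution
    probs, states = [], []
    for P in P_list:
        Q = np.kron(np.eye(n), P)                                # P lifted as I (x) P
        p = np.trace(gamma @ Q).real
        g = (Q @ gamma @ Q) / max(p, 1e-15)
        g4 = g.reshape(n, 2, n, 2)
        states.append(np.einsum('kjmj->km', g4))                 # partial trace over device
        probs.append(max(p, 0.0))
    j = rng.choice(len(P_list), p=np.array(probs) / sum(probs))
    return states[j], j

rng = np.random.default_rng(3)
rho = np.diag([0.7, 0.3]).astype(complex)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (B + B.conj().T) / 2                                         # toy coupled Hamiltonian
P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
rho_new, j = step(rho, H, 0.1, [P0, P1], rng)

assert abs(np.trace(rho_new).real - 1.0) < 1e-10                 # the result is again a state
assert min(np.linalg.eigvalsh((rho_new + rho_new.conj().T) / 2)) > -1e-10
```

Iterating `step` produces one trajectory of the Markov chain whose scaling limits are studied below.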

Belavkin equations for a counting observation
For simplicity we shall work exclusively with finite-dimensional Hilbert spaces H_0 = C^n, making occasional comments about the more general case. The set of states S(C^n) is a compact convex set in the Euclidean space R^{n^2}, the space of complex Hermitian n × n matrices. Let us choose an arbitrary self-adjoint operator in H_0 ⊗ C^2 given by its matrix representation. We are aiming at calculating the small time asymptotics of the Markov transition operators defined by (3.3).
The main idea for obtaining sensible asymptotic limits is to enhance the interaction part C of H by replacing it with the scaled version C/√t. Thus we choose the Hamiltonian in the form

Remark 1
The idea of the scaling comes from the analysis of the so-called quantum Zeno paradox. Its essence is the rather simple observation that if one performs repeated measurements with reduction (3.1) and passes to the limit as the time between measurements tends to zero, then the state effectively remains in the initial state all the time, irrespective of the dynamics. This effect is also referred to as the watchdog effect. Therefore the only way to get a sensible limiting dynamics that takes into account both the dynamics and the observation is to enhance the interaction part of the dynamics, making its effect comparable with that of the repeated reduction (3.1). Thus one can suggest scaling C as C/t^α with some α > 0. As the calculations show (one can repeat the calculations below with an arbitrary α), a sensible limit is obtained only with α = 1/2.
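The Zeno effect behind this scaling is easy to see numerically. In the toy sketch below (our own choice of Hamiltonian), the survival probability of the initial state under n projective measurements on a unit time interval tends to 1 as n grows:

```python
import numpy as np

# Toy illustration of the quantum Zeno (watchdog) effect: with an unscaled
# interaction, n projective measurements on [0, 1] freeze the state as n grows.
sx = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)  # illustrative Hamiltonian
psi0 = np.array([1.0, 0.0], dtype=complex)
lam, V = np.linalg.eigh(sx)

def survival(n):
    """Probability to remain in psi0 after n measurements at times k/n."""
    t = 1.0 / n
    U = (V * np.exp(-1j * t * lam)) @ V.conj().T        # e^{-it sigma_x}
    p_step = abs(psi0.conj() @ U @ psi0) ** 2           # = cos(t)^2 ~ 1 - t^2
    return p_step ** n                                   # ~ exp(-1/n) -> 1

assert survival(2) < survival(10) < survival(1000)
assert survival(10000) > 0.99
```

The per-step effect of the interaction is only of order t^2, so its cumulative effect vanishes; scaling C as C/√t makes the per-step effect of order t, which is exactly what the derivation below exploits.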
By the second equation in (2.3), we get an expansion in which {C, D} = CD + DC denotes the anti-commutator. Using (2.5), and keeping terms of order not exceeding t, we get the approximation which is the key formula for what follows.
As it turns out, the limiting processes are of two types, depending on whether the projectors P_0 and P_1 of the spectral decomposition of R are diagonal, that is of the form (4.2), or not. Let us start with the case of the diagonal projectors (4.2). Computing the evolved state, we find the non-normalized new states and the probabilities with which they occur. Aiming at using Proposition 1 (ii), we then look for the limit of the corresponding transition operator. Denoting T = tr(C*Cρ), we can write, up to terms of order t, the resulting approximation. Summarising, and looking carefully at the small terms that were ignored, we can conclude the following.

Lemma 1 Under the setting considered,
for f ∈ C 2 (S(H 0 )), with L count given by (4.3) and a constant κ.
We can prove now our first result.
Theorem 1 Let H 0 = C n and A, C be n × n square matrices with A being Hermitian. Then: (i) The operator (4.3) generates a Feller process O ρ t in S(H 0 ) and the corresponding Feller semigroup T t in C(S(H 0 )) having the spaces C 1 (S(H 0 )) and C 2 (S(H 0 )) as invariant cores, and T s are bounded in these spaces uniformly for s ∈ [0, t] with any t > 0.
so that the corresponding processes converge in distribution, with the following rates of convergence, where the constant κ(t) depends on the dimension n and the norms of A and C. (iii) The scaled semigroups T^λ_s converge to the semigroup T_s, as λ → 0, so that the corresponding processes converge in distribution, with the following rates of convergence. Proof (i) This is a consequence of Proposition 3. To make this conclusion one needs to check property (11.3) with K = S(C^n). It is straightforward to see that the solutions to the ODE ρ̇ = b(ρ) preserve the affine set of Hermitian matrices with unit trace, so the key point is the preservation of positivity. It turns out that a stronger version of (11.3) holds, namely that d(ρ + hb(ρ), K) = 0 for any ρ from the boundary of K and all sufficiently small h. By the compactness of the unit ball in C^n, this claim follows from the corresponding statement for boundary points ρ of K, that is, for ρ with a nontrivial kernel. (ii) This is a consequence of (i), Proposition 1 (ii) and the observation that (10.5) holds here with the triple of spaces C(S(H_0)), C^1(S(H_0)), C^2(S(H_0)). (iii) This is a consequence of (i), formula (3.6) and Proposition 1 (i), with B = C(S(H_0)), D = C^2(S(H_0)).

Remark 2
This result extends almost automatically to the case of an arbitrary separable Hilbert space H_0 and arbitrary bounded operators H, C, with the derivatives understood in the Fréchet sense. The only point where the finite-dimensional setting was used was in proving statement (i), which relies on the compactness of the unit ball in C^n and the Brezis theorem. In the infinite-dimensional case one can use the compactness of the unit ball in a Hilbert space in the weak topology and the Banach-space version of the Brezis theorem, as presented in [32] and [30].
As is seen directly via Itô's formula, the Feller process O ρ_t generated by (4.3) can be described as solving a jump-type SDE driven by a counting process N_t with the position-dependent intensity tr(C*Cρ), so that the compensated process N_t − ∫_0^t tr(C*Cρ_s) ds is a martingale. Equation (4.7) is the Belavkin quantum filtering SDE corresponding to counting-type observation (because the driving process N_t is a counting process). The representation via the generator is an equivalent way of specifying the process of continuous quantum observation and filtering.
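For readers who want to experiment, here is a minimal Euler scheme with thinning for the counting filtering equation; we quote the equation in its standard textbook form, and H and C below are toy choices of ours:

```python
import numpy as np

# Euler scheme with thinning for the counting filtering equation in its
# standard form:
#   d rho = ( -i[H,rho] - (1/2){C*C, rho} + tr(C*C rho) rho ) dt
#           + ( C rho C^* / tr(C*C rho) - rho ) dN_t,
# where N_t has the position-dependent intensity tr(C*C rho_t).
rng = np.random.default_rng(4)
H = np.array([[1.0, 0.2], [0.2, -1.0]], dtype=complex)
C = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)    # a lowering operator
CC = C.conj().T @ C
rho = np.diag([0.2, 0.8]).astype(complex)

dt, T = 1e-3, 1.0
for _ in range(int(T / dt)):
    rate = np.trace(CC @ rho).real                       # current jump intensity
    if rate > 1e-12 and rng.random() < rate * dt:        # thinning: a jump occurs
        rho = (C @ rho @ C.conj().T) / np.trace(C @ rho @ C.conj().T).real
    else:
        drift = (-1j * (H @ rho - rho @ H)
                 - 0.5 * (CC @ rho + rho @ CC) + rate * rho)
        rho = rho + dt * drift
    rho = (rho + rho.conj().T) / 2                       # keep Hermitian
    rho = rho / np.trace(rho).real                       # correct roundoff in trace

assert abs(np.trace(rho).real - 1.0) < 1e-9
assert min(np.linalg.eigvalsh(rho)) > -1e-3              # positive up to scheme error
```

Note that both the drift and the jump mechanism preserve the trace exactly, which is why the trajectory stays in the state space up to discretization error.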

Remark 3
Equation (4.7) is slightly nonstandard in that the driving noise N_t is itself position dependent. However, there is a natural way to rewrite it in terms of an independent driving noise. Namely, with a standard Poisson random measure N(dx dt) on R_+ × R_+ (with Lebesgue measure as intensity), one can rewrite equation (4.7) in an equivalent form; see the details of this construction in [38]. Alternatively, one can make sense of (4.7) in terms of the general theory of weak SDEs from [20].

Remark 4
The meaning of the term 'counting observation' (as well as 'diffusive type' of the next section) becomes more concrete in a more advanced treatment of the process of quantum measurement, see e.g. [12].
A general pair of orthogonal projectors in C^2 is easily seen to be of a form parametrized by two angles φ and ψ. The phase term with ψ does not make much difference, so we choose further ψ = 0. Moreover, to avoid the diagonal case we assume φ ≠ πk/2, k ∈ Z.
By (2.4), for arbitrary matrices a, b, c, d, we obtain the corresponding block expressions; since P_1 is obtained from P_0 by changing φ to φ + π/2, the analogous formulas follow for P_1. To get the new states we have to take a, b, c, d from (4.1). Hence for the non-normalized states we get approximate formulas (up to terms of order t), and these states occur with the corresponding probabilities. For arbitrary numbers a, b, c, one can write, up to terms of order t, the required expansions. Consequently, with this order of approximation, the normalized states are given by explicit formulas with Ω = tr(ρC* + Cρ). The terms of order t in the p_j give contributions of lower order, so that to the main order in small h the transition operator simplifies. The terms of order h^{−1/2} cancel, and in the main term we get an expression which is remarkably independent of φ! Thus, taking into account the terms that were ignored within the approximation, we obtain the following counterpart of Lemma 1:

Lemma 2
Under the setting considered, and for any φ ≠ πk/2, k ∈ Z, with L_dif given by (5.1).
Unlike the jump-type limiting processes analysed in the previous section, where a straightforward purely analytic proof of the well-posedness of the process generated by L is available, here an approach using SDEs is handy. Itô's formula shows that a process generated by (5.1) can arise from solving the following Itô SDE, where W_t is a standard one-dimensional Wiener process. This SDE is the Belavkin quantum filtering SDE for normalized states corresponding to the diffusive type of observation.
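A minimal Euler-Maruyama sketch of the diffusive filtering equation is given below; we quote the equation in its standard textbook form, and H and C are toy choices of ours. Note that the trace is preserved and the purity is approximately preserved along the trajectory, in line with the structural properties of (5.3) discussed later in this section:

```python
import numpy as np

# Euler-Maruyama scheme for the diffusive filtering equation in its standard form:
#   d rho = ( -i[H,rho] + C rho C^* - (1/2){C*C, rho} ) dt
#           + ( C rho + rho C^* - tr((C + C^*) rho) rho ) dW_t.
rng = np.random.default_rng(5)
H = np.array([[0.5, 0.1], [0.1, -0.5]], dtype=complex)
C = 0.5 * np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)
CC = C.conj().T @ C
rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)   # the pure state |+><+|

dt, T = 1e-4, 1.0
for _ in range(int(T / dt)):
    drift = (-1j * (H @ rho - rho @ H) + C @ rho @ C.conj().T
             - 0.5 * (CC @ rho + rho @ CC))
    m = np.trace((C + C.conj().T) @ rho).real              # observed signal drift
    diff = C @ rho + rho @ C.conj().T - m * rho            # trace-free diffusion term
    rho = rho + drift * dt + diff * np.sqrt(dt) * rng.normal()
    rho = (rho + rho.conj().T) / 2                         # keep Hermitian
    rho = rho / np.trace(rho).real                         # correct roundoff in trace

assert abs(np.trace(rho).real - 1.0) < 1e-9
assert abs(np.trace(rho @ rho).real - 1.0) < 0.05   # purity approximately preserved
```

Both the drift and the diffusion coefficient have zero trace, which is what keeps the scheme on the affine set of unit-trace matrices up to roundoff.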
Theorem 2 Let H 0 = C n and A, C be n × n square matrices with A being Hermitian. Then: (i) The operator (5.1) generates a Feller process O ρ t in S(H 0 ) and the corresponding Feller semigroup T t in C(S(H 0 )) having the spaces C 2 (S(H 0 )) and C 3 (S(H 0 )) as invariant cores, and T s are bounded in these spaces uniformly for s ∈ [0, t] with any t > 0. This process is given by the solutions to SDE (5.3), which is well posed as a diffusion equation in S(H 0 ).
so that the corresponding processes converge in distribution, with the rates of convergence (5.4), where the constant κ(t) depends on the norms of A and C.
(iii) The scaled semigroups T^λ_s converge to the semigroup T_s, as λ → 0, so that the corresponding processes converge in distribution, with the following rates of convergence. Proof Parts (ii) and (iii) are obtained by the same arguments as in the proof of Theorem 1. One only has to mention that estimate (10.3), needed to apply Proposition 1, follows from a standard moment estimate of the theory of diffusions. For part (i), we use the fundamental result on the Stratonovich integral, stating that solutions to Stratonovich SDEs can be obtained as the limits of the solutions to the ODEs obtained by approximating the white noise with smooth functions. Hence the solutions to this Stratonovich equation preserve positivity of matrices, provided the corresponding ODEs, with the white noise replaced by an arbitrary continuous function φ_t, preserve the set of positive matrices. But this follows from the Brezis Theorem 6. To see this we substitute the expression for B(ρ) in the first three places of the last square bracket, yielding equation (5.7) (the key point is that the 'nasty' term CρC* cancels). It is seen that Theorem 6 applies, because whenever (v, ρv) = 0, the r.h.s. ω_t(ρ) of equation (5.7) satisfies (v, ω_t(ρ)v) = 0 for any function φ_t. The details of the argument are the same as in the proof of Theorem 1.
Remark 5 Yet another way to prove the preservation of positivity can be carried out via the theory of boundary points. Namely, from Proposition 6.4.1 in [21] it follows that, for any unit vector v, a matrix ρ of rank n − 1 such that (v, ρv) = 0 is an inaccessible boundary point for the domain (v, ρv) > 0. Hence for a dense countable set of unit vectors {v_j} we can conclude that (v_j, ρ_t v_j) > 0 for all j and t almost surely. Consequently (v, ρ_t v) ≥ 0 for all v and t almost surely.

Remark 6
The methods developed can be used to extend this result to infinite dimensional H 0 . However, unlike the situation with counting observations, explained in Remark 2, there is some subtlety here in working with SDEs in the space of trace class operators, which we are not going to discuss in this paper.
A remarkable property of the SDEs (4.7) and (5.3) is that they preserve pure states: if the initial state ρ was pure, ρ = ψ ⊗ ψ, then it remains pure for all times. Indeed, one can check by a direct application of Itô's formula that this holds if ψ satisfies the corresponding vector-valued SDE. Another key observation is that there exists an equivalent linear version of (5.3). Namely, assume that ξ solves the following Belavkin quantum filtering SDE for non-normalized states, where Y_t is a Brownian motion under a certain measure. Applying Itô's formula to ρ = ξ/tr ξ, one finds that ρ satisfies (5.3) with the process W satisfying the corresponding equation. It follows from the famous Girsanov theorem that if Y_t is a Wiener process, then W_t is also a Wiener process under a different but equivalent measure with respect to the one defining Y_t. Hence a solution ξ_t to the linear equation (5.10) with some Brownian motion Y_t yields the solution ρ = ξ/tr ξ to (5.3) with some other Brownian motion W_t.

Observations via different channels
Let us now extend the theory to the case of several channels of observation. Namely, we take the space of the measuring device to be the tensor product of K copies of C^2, one copy for each channel. Thus H is specified by K + 1 operators A, C_1, ..., C_K in H_0, with the H_j given by the corresponding formulas. At the starting time of an interaction the devices are supposed to be set to their vacuum states, so that a state ρ on H_0 = C^n lifts to H accordingly. The observation procedure can be specified by choosing two orthogonal projectors P_{j0} and P_{j1} in the space C^2 of each device (that is, in each channel of observation), arising from some observables with the spectral decompositions Σ_l λ_l P_{jl}. This choice yields a totality of 2^K orthogonal projectors in H, so that the possible new non-normalized states after each step of interaction and measurement can be written down, where tr_{p1···K} denotes the partial trace with respect to all spaces except H_0. These states occur with the probabilities p_{i_1···i_K}, where γ_t and the probabilities p_{i_1···i_K} are given by (6.4) and (6.5). The transition operator of this Markov chain is then written down accordingly. The operators in H are best described in terms of blocks. Namely, writing H = ⊕H_{i_1···i_K}, with H_{i_1···i_K} generated by H_0 ⊗ e_{i_1} ⊗ ··· ⊗ e_{i_K}, we can represent an operator L in H by 4^K block operators. The composition and partial trace in these notations are expressed by the corresponding formulas. For simplicity let us perform the detailed calculations for K = 2 (they are quite similar in the general case). Thus H = C^n ⊗ C^2 ⊗ C^2 and H = H_0 + H_1 + H_2. Let us denote the bases of the two devices by {e_k} and {f_k} respectively. Formulas (6.2) then rewrite in a simpler way. With the chosen vacuum vectors e_0 = (1, 0) in the first device and f_0 = (1, 0) in the second device, a state ρ on H_0 = C^n lifts to H as ρ ⊗ Ω ⊗ Ω. The operators L in H are described by 16 block operators L^{lm}_{jk}. To shorten the formulas, let us perform the calculations without scaling the C_j (without the factor 1/√t) and restore the scaling at the end.
In terms of the blocks we can write the corresponding expressions, where we have introduced the following notation: for i being 0 or 1, we denote by ī the complementary index 1 or 0 respectively. By (6.8) the required block identities follow, and thus all parts of (2.5) are collected. Let us turn to (6.3). From the calculations with a single channel we know that one has to distinguish diagonal and non-diagonal projectors P_{jk}. Let us start with the case when in both devices the projectors are diagonal, that is

Let us calculate the required partial traces for an arbitrary L. Restoring the scaling C → C/√t then yields the approximate expressions, and in particular p_{11} = 0 up to the order considered.
Thus we get, up to terms of order h in small h, the resulting expression. Summarising, and extending to an arbitrary number of channels K, we can conclude that we have proved the following extension of Lemma 1.

Lemma 3 Under the setting considered,
for f ∈ C^2(S(H_0)), with L_count given by (6.11). As a consequence we get the following direct extension of Theorem 1.

Theorem 3
Let H_0 = C^n and let A, C_1, ..., C_K be operators in H_0 with A Hermitian. Let the projectors defining the measurements be chosen diagonal in each channel, that is, of type (6.12) for all j = 1, ..., K. Then all the statements of Theorem 1 hold for the operator (6.11) and the Markov semigroups described by the transition operator (6.7). In particular, estimates (4.5) and (4.6) hold.

Remark 7
As explained in Remark 2, this result extends automatically to the case of an arbitrary separable Hilbert space H_0 and bounded operators A, C_1, ..., C_K in it.
As in the case of a single channel, the process generated by (6.11) can be described by the solutions to an SDE of jump type, where the counting processes N_t^j are independent and have the position-dependent intensities tr(C_j* C_j ρ). Equation (6.13) is the Belavkin quantum filtering SDE corresponding to counting-type observation via several channels.
As suggested by Theorem 2, exploiting non-diagonal pairs of projectors P_{j0}, P_{j1} should lead to a limiting generator of diffusive type. In fact, performing similar calculations (which we omit) one arrives at the following general result.

Theorem 4
Let H_0 = C^n and let A, C_1, ..., C_K be operators in H_0 with A Hermitian. Let the projectors defining the measurements be chosen diagonal, that is of type (6.12), for a subset I ⊂ {1, ..., K} of the set of channels, and for j ∉ I let them be non-diagonal, that is of the form (6.14) with φ_j ≠ kπ/2, k ∈ Z. Then the limiting generator for the semigroup with the transition operator (6.7) gets the expression (6.15), corresponding to an SDE in which the W_j are independent Wiener processes and the N_t^j are independent jump processes of intensity tr(C_j ρ C_j*).

Proof
In the purely diffusive case, that is with empty I, the proof is exactly the same as for Theorem 2. For the general case one only has to show that the operator L_mix generates a Feller process in S(H_0) preserving the sets of smooth functions (the other arguments are again the same). Two proofs of this fact can be suggested. (i) One starts with the generator L̃_mix obtained from (6.15) by ignoring the jump part. This is a well-defined diffusion operator, and by the same methods as in Theorem 2 one shows that it generates a Feller process in S(H_0). The jump part of (6.15) is a bounded operator preserving positivity and smoothness. Hence it can be dealt with straightforwardly via perturbation theory. (ii) Each of the two parts of (6.15), related to I and its complement, generates a well-defined Feller process in S(H_0) preserving smoothness (of arbitrary order, in fact). Hence one can derive that the sum of these operators generates a well-defined Feller process in S(H_0) via the Lie-Trotter formula, namely from Theorem 5.3.1 of [21].

Remark 8
The Markov chain of multichannel measurements that we are using is a bit different from the one used in [38], where the measurement is based on a single operator R in the device (no separate channels), and the counting and diffusive parts of the generator arise from different projectors linked to different eigenspaces of this operator. As was already mentioned, the method of [38] did not provide rates of convergence.
When I is empty, L_mix turns into L_dif, describing multichannel observations of diffusive type.

Fractional quantum stochastic filtering
Now everything is ready for our main result: the derivation of the fractional equations of quantum stochastic filtering. As was shown above, the standard Belavkin equations of quantum filtering can be obtained as scaled limits of sequences of discrete observations. The main assumption for each of the approximating processes was that the time between successive measurements is either constant (discrete-time Markov chain approximation) or exponentially distributed (continuous-time Markov chain approximation). Of course there are no a priori reasons for these assumptions, and in fact in several domains of physics it has turned out to be more appropriate to model the times between successive events by random variables from the domain of attraction of a stable law, that is, via CTRW.
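The effect of heavy-tailed waiting times is easy to see in simulation: with Pareto times from the domain of attraction of a β-stable law, the expected number of measurement events by time T grows like T^β rather than linearly. A sketch with our own illustrative parameters:

```python
import numpy as np

# Waiting times from the domain of attraction of a beta-stable law, beta in (0,1):
# Pareto times with P(tau > t) = t^{-beta} for t >= 1.
rng = np.random.default_rng(6)
beta = 0.5

def n_events(T, rng):
    """Number of renewal events on [0, T] for Pareto waiting times."""
    total, n = 0.0, 0
    while total < T:
        total += rng.pareto(beta) + 1.0    # shifted Lomax: P(tau > t) = t^{-beta}
        n += 1
    return n

mean_100 = np.mean([n_events(100.0, rng) for _ in range(2000)])
mean_400 = np.mean([n_events(400.0, rng) for _ in range(2000)])
ratio = mean_400 / mean_100
# For beta = 1/2, quadrupling the horizon should roughly double E N(T): 4^0.5 = 2.
assert 1.5 < ratio < 2.8
```

This sublinear growth of the number of acts of measurement is precisely what produces the fractional derivative in the scaling limit below.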
Our next result, Theorem 5, is a direct consequence of Theorem 4 and Proposition 5: the limiting evolution solves a fractional equation of type (12.5) with the generator L = L_mix given by (6.15).
As noted at the end of Appendix C, the fractional derivative D^β_{0+} is a particular case of the class of mixed fractional derivatives (12.5). Therefore, under appropriately organised scaled times between the acts of measurement, the limiting evolution will satisfy a more general fractional equation with D^ν given by (12.8).
When only one type of observation channel is used, equation (7.1) simplifies to the case when either L_count or L_dif is placed instead of L_mix. Equations (7.1) (and their particular cases with the fractional derivative D^β of order β) represent the fractional analogs of the process of quantum stochastic filtering. These equations can also be considered as new equations of fractional quantum mechanics. They are different from the fractional Schrödinger equations suggested in [31] and extensively studied recently. Equations (7.1) describe the process of continuous quantum control and filtering on the level of the evolution of averages. On the 'micro-level' of the SDEs (6.16), these equations correspond to stopping the solutions of these SDEs at a random time σ_t given by the inverse of a Lévy subordinator.
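Assuming, as is standard in CTRW scaling limits, that D^β_{0+} denotes the Caputo derivative, it can be evaluated numerically by the classical L1 scheme. The sketch below (a generic illustration, not tied to the paper's operators) checks it on f(t) = t, for which D^β t = t^{1−β}/Γ(2−β):

```python
import numpy as np
from math import gamma

# L1 scheme for the Caputo derivative D^beta f(t_N) on a uniform grid:
# (dt^{-beta}/Gamma(2-beta)) * sum_k w_{N-1-k} (f_{k+1} - f_k),
# with weights w_i = (i+1)^{1-beta} - i^{1-beta}.
beta, dt, N = 0.5, 1e-3, 1000                 # evaluate at t = N*dt = 1
t = np.arange(N + 1) * dt
f = t.copy()                                   # test function f(t) = t
w = np.arange(1, N + 1) ** (1 - beta) - np.arange(N) ** (1 - beta)
increments = np.diff(f)                        # f(t_{k+1}) - f(t_k)
approx = (dt ** (-beta) / gamma(2 - beta)) * np.sum(w[::-1] * increments)
exact = 1.0 ** (1 - beta) / gamma(2 - beta)    # D^beta t = t^{1-beta}/Gamma(2-beta)
assert abs(approx - exact) < 1e-6
```

For piecewise-linear f the L1 quadrature is exact, so the agreement here is limited only by floating-point roundoff; for general f the scheme has accuracy O(dt^{2−β}).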

Fractional quantum control and games
The theory of quantum filtering reduces the analysis of quantum dynamic control and games to the controlled version of evolutions (6.16). The simplest situation concerns the case when the homodyne device is fixed, that is, the operators C_j and the projectors P^j_i are fixed, and the players can control the individual Hamiltonian H_0 of the atom, say, by applying appropriate electric or magnetic fields to the atom. Thus equations (6.16) become modified by allowing H_0 to depend on one or several control parameters. The so-called separation principle (see [11]) states that the effective control of an observed quantum system (which can in principle be based on the whole history of the interaction of the atom and optical devices) can be reduced to the Markovian feedback control of the quantum filtering equation, with the feedback at each moment depending only on the current (filtered) state of the atom.
In the present case of CTRW modeling of the measurement process, the control problem becomes a problem of control of scaled CTRWs. The theory of such control was built in the series of papers [27][28][29]. The main result is that in the scaling limit the cost function solves a fractional Hamilton-Jacobi equation. In the present context and in the game-theoretic setting this implies the following. Let us consider the controlled version of the process O^ρ_{σ_t} from Theorem 5, where the individual Hamiltonian is now Ĥ_0 = H_0 + u H^1_0 + v H^2_0, depending on control parameters u, v of two players taken from compact sets U and V respectively. Suppose that it is possible to choose new u, v directly after each act of measurement, so that a control strategy is a sequence (u_1, v_1), (u_2, v_2), · · · of controls, with each (u_j, v_j) applied after the jth act of measurement and depending on the history of the process until that time. The case of a pure control (not a game) corresponds to the choice V = {0} and is thus automatically included. Assume that players I and II play a standard dynamic zero-sum game with a finite time horizon T, meaning that the objective of I is to maximize a payoff built from operators J and F expressing the running and terminal costs of the game (they may depend on u and v, but we exclude this case just for simplicity), where W is the collection of all noises involved in (6.16) (both diffusive and Poisson). Then, under the scaling limit of Theorem 5, the optimal cost function satisfies the fractional HJB-Isaacs equation (8.3) of the CTRW modeling of quantum games. In [27] this equation was derived heuristically, in the general framework of controlled CTRW, by the dynamic programming approach.
As usual in optimal control theory, to justify the derivation one has to show the well-posedness of the limiting HJB equation and then to prove a verification theorem; a classical reference is [15]. For some cases of CTRWs this was performed in [29].
In the present fractional quantum case this problem will be considered elsewhere. The additional complexity of this equation stems from the fact that the state space is a rather nontrivial set: the positive matrices with unit trace. One can reduce the complexity by looking at the dynamics of pure states only. But the set of pure states is not a Euclidean space; it is a manifold, which in the finite-dimensional setting is the complex projective space CP^n.
Let us mention that in the non-fractional case, that is, with the usual derivative ∂/∂t instead of D^ν_{0+} in (8.3), the well-posedness of (the analogs of) equation (8.3) was proved in [18], for a special model of pumping a laser with a counting measurement, with some particular solutions calculated explicitly, and in [24], for a special arrangement of diffusive measuring devices ensuring that the diffusive part of the operator L_dif was nondegenerate, so that the optimal control problem reduced to the drift control of diffusions on the Riemannian manifold CP^n.

Other Markov approximations and unbounded generators
We commented above on the possible extension to infinite-dimensional Hilbert spaces. However, in all approximations the assumption of boundedness of all the operators involved seemed essential for the derivation given, at least for the coupling operators C_j (unboundedness of A can possibly be treated via the interaction representation). Yet the quantum filtering equations are also used in the standard setting of quantum mechanics. The most studied case is that of the standard Hamiltonian H = −Δ + V(x) in L^2(R^d), with the coupling operators being either position (multiplication by x) or momentum operators. Different Markov chain approximations may be used to derive the filtering equation in this case.
A powerful approach was suggested by Belavkin in [9]: to use the von Neumann model of unsharp measurement. In this model the effect of measurement on the product state φ(x) f(y) of an atom and a measuring device (a pointer) is given by the shift φ(x) f(y) → φ(x) f(y − x). Here both φ and f are from L^2(R^d), and f > 0 describes the stationary state of the pointer (the analog of the vacuum state in our modeling above). Projecting on the state of the atom, this yields a transition depending on the observed position y of the pointer. Assuming the evolution of the atom during the time t between the moments of measurement to be given by a Hamiltonian A, one obtains the transitions of a Markov chain of sequential measurements. After an appropriate scaling of this Markov chain one derives the diffusive filtering SDE (5.9) with C = x (the multiplication operator), that is, directly the filtering equation for pure states; see details in the Appendix to [10]. The model can be extended to more general situations, but it seems to be linked with the specific von Neumann instantaneous interaction. For the well-posedness of this kind of diffusive SDE we refer to [14,17] and references therein. The derivation of the fractional version of this equation, as well as of the fractional control of Section 8, can be performed in this setting in the same way as above.
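One step of this unsharp position measurement can be sketched numerically as follows (our own illustration, not from [9]; the Gaussian pointer state, the two-packet atom state, and the grid discretization are all assumptions). The pointer reading y is sampled from the joint density |φ(x)|² |f(y − x)|², and the posterior atom state is φ(x) f(y − x) renormalized:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-10, 10, 401)
dx = x[1] - x[0]

# atom wave function: superposition of two Gaussian packets (illustrative)
psi = np.exp(-(x - 2.0) ** 2) + np.exp(-(x + 2.0) ** 2)
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)

# pointer stationary state f > 0: a broad Gaussian, i.e. an unsharp pointer
sigma = 3.0
def pointer(y):
    return (2 * np.pi * sigma ** 2) ** -0.25 * np.exp(-y ** 2 / (4 * sigma ** 2))

def measure(psi, rng):
    """One von Neumann unsharp position measurement:
    sample the pointer reading y from |psi(x)|^2 |f(y-x)|^2 (first the
    atom position from |psi|^2, then y normally around it, since |f|^2
    is Gaussian with std sigma), then project:
    psi(x) -> psi(x) f(y - x) / norm."""
    p = np.abs(psi) ** 2 * dx
    xi = rng.choice(x, p=p / p.sum())
    y = xi + rng.normal(0.0, sigma)          # observed pointer reading
    post = psi * pointer(y - x)
    post /= np.sqrt(np.sum(np.abs(post) ** 2) * dx)
    return y, post

y, post = measure(psi, rng)
print(y, np.sum(np.abs(post) ** 2) * dx)
```

Iterating this update, interleaved with the unitary step e^{-itA}, gives precisely the Markov chain of sequential measurements whose scaling limit is the diffusive filtering SDE with C = x.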

Appendix A. Convergence of semigroups
Here we collect results on the convergence of Markov semigroups and CTRWs, which form the theoretical basis for our derivations of the filtering equations.
It is well known that the convergence of generators on a core of the limiting generator implies the convergence of the semigroups. We shall use a version of this result with rates of convergence, namely Theorem 8.1.1 of [21].

The additional condition (10.3) makes working with the discrete approximation a bit more subtle than with the continuous chain approximations. Effectively, to get (10.3) one needs deeper regularity: namely, one should have another core D̃ such that D̃ ⊂ D ⊂ B with L ∈ L(D̃, D) ∩ L(D, B). In this case it is easy to see that

Appendix B. Deterministic motions with random jumps
Let us look at the Cauchy problem with the simplest jump-type operator. It is more or less obvious that the resolving operators of the Cauchy problem (11.1) form a semigroup of contractions in the space C(R^d) preserving the spaces of smooth functions. Let us make a precise statement. The simplest way to see this is via the 'interaction representation'. Namely, let X_t(x) denote the solution to the Cauchy problem Ẋ_t(x) = b(X_t(x)), X_0(x) = x, and let us change the unknown function f in (11.1) to φ via the equation f(x) = φ(X_t(x)). Direct substitution shows that φ solves a Cauchy problem with a certain bounded operator L_t. Since L_t is bounded, this Cauchy problem can be solved by the convergent series over the powers of L_t. This leads to the following result. We also need an extension of this result to subsets of R^d. The main tool is the following classical theorem of Brezis, which we formulate in its simplest form, referring to [40] for proofs, extensions and history.
for any x ∈ K, where d(z, K) denotes the distance between a point z and the set K. Then K is flow invariant. More precisely, for any x ∈ K there exists a unique solution X_t(x) of the equation Ẋ_t(x) = b(X_t(x)) with the initial condition x that belongs to K for all t.
As a direct consequence we get the following extension of Proposition 2.

Proposition 3 Let K be a convex compact subset of R^d, and let b : K → R^d and Y_j : K → K be twice continuously differentiable functions, with b satisfying the assumptions of Theorem 6. Then the resolving operators R_t of the Cauchy problem (11.1) form a semigroup of contractions in C(K) such that the spaces C^1(K) and C^2(K) are invariant, and the R_t are uniformly bounded operators in these spaces for t ∈ [0, T] with any T.
Suppose T^h_1, T^h_2, · · · is a sequence of i.i.d. random variables in R_+ such that the distribution of each T^h_i is given by a probability measure μ^h_time(dt) on R_+ that depends on a positive (scaling) parameter h. Suppose X^h_1, X^h_2, · · · is a sequence of i.i.d. random variables in R^d such that the distribution of each X^h_i is given by a probability measure μ^h_space(dx) that depends on h. The standard (scaled) continuous time random walk (CTRW) is the random process given by the corresponding random sum. In a position dependent CTRW the jumps X^h_i are not independent, but each X^h_i depends on the position of the process before the jump. The natural general formulation can be given in terms of discrete Markov chains as follows. Let U^h be the transition operator of a discrete time Markov chain O^h_n(x) in R^d depending on a positive parameter h, defined by some family of stochastic kernels μ^h(x, dy) such that U^h is a bounded operator either in the space C(K) with a compact convex subset K of R^d or in the space C_∞(R^d) of continuous functions vanishing at infinity. For our purposes we need only operators of a specific type. The chain run at the random times generated by the waiting times T^h_i is a generalized scaled (position dependent) continuous time random walk (CTRW) arising from U^h and μ^h_time. CTRWs were introduced in [37] and have found numerous applications in physics. The scaling limits of these CTRWs were analysed by many authors, see e.g. [26,33,34].
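A scaled CTRW of this kind is easy to simulate. The following sketch (all concrete choices are illustrative assumptions: symmetric Bernoulli jumps of size √h, so that the spatial chain approximates a diffusion, and Pareto waiting times with tail index β in the domain of attraction of a β-stable law) produces samples of the process at a fixed physical time t:

```python
import numpy as np

rng = np.random.default_rng(3)
beta, h, t = 0.8, 1e-2, 1.0   # tail index, scaling parameter, physical time

def ctrw(x0, rng):
    """One sample of a scaled CTRW at time t: Markov-chain jumps of size
    sqrt(h) separated by scaled Pareto waiting times h^{1/beta} * T_i.
    Position-independent jumps are used here for simplicity; a position
    dependent chain would draw the jump from a kernel mu^h(x, dy)."""
    x, elapsed = x0, 0.0
    while True:
        w = h ** (1.0 / beta) * rng.pareto(beta)  # waiting time before next jump
        if elapsed + w > t:
            return x                               # position at physical time t
        elapsed += w
        x += np.sqrt(h) * rng.choice([-1.0, 1.0])  # spatial jump of the chain

positions = np.array([ctrw(0.0, rng) for _ in range(2000)])
print(positions.mean(), positions.var())
```

In the scaling limit h → 0 this process converges to a Brownian motion subordinated to the inverse of a β-stable subordinator: the empirical mean stays near zero, while the variance grows like t^β rather than linearly, the signature of subdiffusion.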
The scaling limit for position dependent CTRWs was developed in [19]. Formally, [19] treated not the full generality, but the case of the spatial process O^h_n(x) converging to a stable process. However, the arguments of [19] are completely general and do not depend on this assumption; the only point used is that the O^h_n(x) converge in the sense of Proposition 1 (ii). For completeness, let us formulate the result of [19] in the slightly modified version that we need in this paper and present a short proof with essentially simplified arguments from [19] (see also Chapter 8 in [21]).
As an auxiliary result we need the standard functional limit theorem for the random-walk approximation of stable laws; see e.g. [16] and [34] and references therein for various proofs.

Remark 9
This proposition directly implies the following statement in terms of fractional in time differential equations. Namely, under the conditions of Proposition 5, the function f_t(x) = E(F_{σ_t} f)(x) satisfies equation (12.5), where D^β_{0+} is the Caputo-Djerbashian derivative of order β acting on the variable t, and the operator L acts on the variable x.
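Equations of type (12.5) can be solved numerically by standard schemes for the Caputo-Djerbashian derivative. Below is a sketch (our own illustration, not from the paper) of the classical L1 finite-difference scheme for the scalar prototype D^β_{0+} u = −λu, u(0) = 1, whose exact solution is the Mittag-Leffler function E_β(−λ t^β):

```python
import numpy as np
from math import gamma

beta, lam = 0.6, 1.0   # fractional order and decay rate (toy choices)
T, N = 2.0, 400
dt = T / N
c = dt ** (-beta) / gamma(2.0 - beta)
# L1 weights: w_k = (k+1)^{1-beta} - k^{1-beta}, with w_0 = 1
w = np.arange(1, N + 1) ** (1 - beta) - np.arange(N) ** (1 - beta)

# Implicit L1 scheme for  D^beta_{0+} u = -lam * u,  u(0) = 1:
#   c * sum_{j<=n} w_{n-j} (u_j - u_{j-1}) = -lam * u_n
u = np.empty(N + 1)
u[0] = 1.0
for n in range(1, N + 1):
    diffs = u[1:n] - u[:n - 1]         # u_j - u_{j-1} for j = 1..n-1
    H = np.dot(w[n - 1:0:-1], diffs)   # nonlocal memory (history) term
    u[n] = c * (u[n - 1] - H) / (c + lam)
print(u[-1])   # approximates the Mittag-Leffler value E_beta(-lam * T^beta)
```

The history term H is what distinguishes the fractional evolution from a Markovian one: every step carries the weighted memory of all previous increments, and the solution decays algebraically (like t^{-β}) instead of exponentially.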
Recall that a Lévy subordinator is a process generated by the operator (12.6), where ν is a one-sided Lévy measure, that is, it satisfies the condition ∫ min(1, y) ν(dy) < ∞. Proposition 5 is based on the central limit theorem for stable laws stating the convergence Φ^h_t → S_t of random walk approximations to a stable Lévy subordinator. If the scaled random walks Φ^h_t are designed in such a way that they approximate an arbitrary Lévy subordinator, that is, Φ^h_t → S_t with S_t generated by (12.6), then similar arguments yield the analogous result, where σ_y = max{t : S_t ≤ y} and N^h_y = max{t : Φ^h_t ≤ y}.