Large deviations at level 2.5 for Markovian open quantum systems: quantum jumps and quantum state diffusion

We consider quantum stochastic processes and discuss a level 2.5 large deviation formalism providing an explicit and complete characterisation of fluctuations of time-averaged quantities, in the large-time limit. We analyse two classes of quantum stochastic dynamics, within this framework. The first class consists of the quantum jump trajectories related to photon detection; the second is quantum state diffusion related to homodyne detection. For both processes, we present the level 2.5 functional starting from the corresponding quantum stochastic Schr\"odinger equation and we discuss connections of these functionals to optimal control theory.


Introduction
The time-evolution of closed quantum systems is unitary, deterministic, and governed by Schrödinger equations. By contrast, open quantum systems (see e.g. [1,2] for reviews) are constantly interacting with their environment. In such cases, dynamics is no longer unitary due to dissipation and mixing effects, and to the flow of information into the (often infinitely many) degrees of freedom of the environment. Within Markovian and weak coupling approximations, such system dynamics are implemented by Lindblad (or Lindblad-Gorini-Kossakowski-Sudarshan) dynamical generators [3,4], which describe evolution under the assumption that the systembath interaction is not monitored in any way. The resulting quantum dynamics is deterministic and probability conserving but in general non-unitary.
However, modern experiments can monitor correlations between the system dynamics and the environment through suitable measurement processes [5,6,7,8,9]. For example, a single experiment can yield a time-record of observations, from which the behaviour of the system and the bath can be fully reconstructed. This time-record of events is stochastic because of the fundamental laws of quantum mechanics. It is associated with a quantum trajectory [10,11,12,13] that specifies the evolution of the system state conditioned on the given time-record. Averaging this state over all possible time-records (trajectories) recovers the dynamics generated by a Lindbladian. Going beyond the average, information about dynamical fluctuations is available by analysing stochastic quantum trajectories.
In this paper, we explain how to characterise large dynamical fluctuations of quantum stochastic processes by means of the theory of large deviations (LD) [14,15,16,17,18,19,20,21,22,23,24,25]. In particular, we present results about very general LD functionals which encode information about fluctuations of measurement outcomes. This includes general linear and non-linear functions of the quantum state of the system. We address the two main classes of measurement processes that monitor the interaction of a quantum system with its own environment. One class involves the detection of bath quanta emitted by the system -such as photons or particles -and gives rise to discontinuous quantum jump trajectories [11,26,27,1,2]. The other involves the continuous monitoring of homodyne currents associated to bath operators, which gives rise to quantum state diffusion [10,27,1,2]. We note that similar equations arise also when describing weak or strong measurements of system observables, see for example [28].
The functionals that we derive and discuss represent the counterpart of level 2.5 LD functionals in jump and diffusion processes [29,30,24,31,32] for quantum stochastic processes. A short presentation of the results for quantum jump processes has appeared before in [33]. We now present (in Sec. 2) an overview of our main results, including (in Sec. 2.5) an outline of the structure of the following Sections.

Scope
We consider Markovian open quantum dynamics in which the state of the system is described by a reduced (system) density matrix ρ(t) that is obtained by tracing out the environment. We focus on finite-dimensional quantum systems described by means of a Hilbert space C n , where n is the maximum number of orthogonal (basis) states of the space. The quantum state ρ is then a Hermitian n × n matrix with non-negative eigenvalues and Tr ρ = 1. In the Markovian limit, the dynamics of ρ is given by the Lindblad (or Lindblad-Gorini-Kossakowski-Sudarshan) equation [1,2,3,4] with where H is the system Hamiltonian and the L i are jump operators that depend on the coupling to the environment. This assumption is sufficiently general to cover the dynamics of several interesting quantum systems in contact with their environment [2]. We sometimes refer to the generator L as the Lindbladian.
From the density matrix, it is possible to compute all observable properties of the system. In this work we go further, by considering correlations between system observables and measurements that are made in the environment, as well as time-correlations in the stochastic system dynamics. Two general settings are considered: (i) correlations between system properties and the statistics of quantum jumps, corresponding to emission/absorption (for example of photons) into/from the environment, see Fig. 1(a); and (ii) correlations between system properties with the measurement of homodyne currents see Fig. 1(b). The theories for these two cases are different in their details, although there are common features. The case of quantum jump detection is discussed in Section 3 while homodyne measurements are discussed in Section 4.

Unravelling
Some correlations between system and environment can be analysed by tilted variants of Eq. (1), see [34,35,36,37,38,39,40,41,42]. Here we take a different approach, which is to unravel the joint dynamics of the system and the environmental measurements. This enables access to a larger set of dynamical observables and correlations. The theory is based on the stochastic evolution of a pure-state density matrix ψ t , which is a Hermitian n × n matrix for which one eigenvalue is +1 and the others are all zero. This matrix evolves by a stochastic process [10,2] which we write (schematically) as where dω t represents a random (stochastic) increment for ψ, see below for details. One could also consider the stochastic dynamics of a mixed (non-pure) density matrix. This would be necessary, for instance, in those cases in which the measurement performed on the quantum system can have degenerate outcomes. In these cases, we expect the general theory to be the same, however, some details, such as the space of states in which the stochastic process takes place, would need to be modified (see also Section 3.1). Averaging over the noise with suitable initial conditions for ψ, a general result is that where E is an expectation value for the stochastic process. Hence the (quantummechanical) average of any system observable A can be obtained as A(t) = E[Tr(Aψ t )].
Note that this construction makes use of a density matrix ψ that remains normalised at all times Tr(ψ t ) = 1. Other descriptions of the unravelled dynamics may be expressed in terms of states (or matrices) whose norm (or trace) is also time-dependent [2]. In what follows, we will construct probability distributions for ψ t , for which it is convenient that this object remains normalised.
Every trajectory of the stochastic process (3) is associated with a time-record for the environmental measurements. For example, if the measurements involve photon counting then the noise ω t causes jumps in ψ t , and the number of these jumps is, for example, the number of emitted photons. Writing N t for the number of jumps between time 0 and time t, this allows computation of observables such as This is an example of an observable quantity that depends on correlations between the system observable A and the environmental measurement N t , see for example Ref. [28].
The unravelled system also allows access to objects that are not immediately experimentally observable. In particular, quantum mechanical expectation values are linear functions of ψ but one may also consider objects that are non-linear. For example, in bipartite systems (decomposed into subsystems A and B), the entanglement of the (pure) density matrix ψ is where Tr A denotes a partial trace of subsystem A and χ(ψ) = Tr B (ψ). Then measures the average value of the entanglement shared by the two subsystems. This quantity, obtained as an average over time records, is nowadays receiving a lot of attention [43,44,45,46,47,48,49,50,51].

Large deviations at level 2.5
Our focus in this work is on large deviations of time-integrated quantities. A simple example would be N t , the number of emitted photons, as above. For large t, the distribution of N t is sharply-peaked, in the sense that its mean is proportional to t, while its standard deviation is proportional to √ t. Large deviation theory [17,18,19,20,21,22,23,24,25] can be used to analyse the rare events where N t differs significantly from its mean value, as t → ∞. The statistical properties of these events are described by large deviation theory at level 1, within the classification of Donsker and Varadhan [52,53,54,55].
Here we are concerned with large deviations at a more abstract level of theory, which is called in the LD jargon level 2.5. To explain this, we first consider level 2, which motivates us to define the empirical measure for the (pure-state) density matrix ψ. This is where we have introduced a Dirac delta function in the space of density matrices, see later sections for details. For a trajectory of (3), this µ t (ψ) measures (roughly speaking) the fraction of the time interval [0, t] that the system spent in state ψ. We assume throughout that the system has a unique stationary state, hence for large times µ t (ψ) should converge to some P ∞ (ψ), which is the steady state distribution for ψ. Large deviation theory at level 2 allows computation of the probability that µ t differs significantly from P ∞ . However, this level 2 theory is not sufficient for our purposes, for example it cannot capture the probability distribution of quantities like N t . The solution to this problem is to consider the joint statistics of µ t and the empirical fluxes Q t , which correspond to time-averaged jump rates for all possible quantum jumps. The precise definition of Q t depends on the structure of the noise term dω t in Eq. (3), see below for details. The level 2.5 theory states that the joint distribution of empirical measure and empirical fluxes behaves as where I 2.5 is an explicit rate function. The notation here a shorthand which indicates that the random variables (µ t , Q t ) should lie inside small sets that contain the values (µ, Q), and that the equality is valid on the exponential scale as t → ∞. For a rigorous mathematical formulation of LD principles, see for example [16]. Two central results of this paper (following [33]) are explicit formulae for I 2.5 for the two classes of Markovian open quantum system that were introduced in Sec. 2.1. These results generalise existing results for large deviations at level 2.5 in classical Markov processes [29,56,30,57,31,32].
Large deviation principles (LDPs) at level 2.5 have several applications. Two of the most important are: (i) they give a variational characterisation of large deviations at level-1; and (ii) they allow derivation of general bounds on fluctuations, such as thermodynamic uncertainty relations, see for example [58,59,60,61,62,63,64,65,66]. The connection to level 1 is discussed in detail below; the connection to thermodynamic uncertainty relations was discussed in [33], with a brief recap in Sec. 3.4.3, see also Appendix D.

Example two-state systems
We illustrate the abstract arguments so far by a simple two-state quantum system, that is n = 2. This might represent a single spin or a single qubit. We emphasise where superposition is not possible, the allowed states are only the classical spin configurations with spin pointing up | ↑ (north pole) and with spin pointing down | ↓ (south pole). Therefore, the empirical measure must be given by Dirac deltas in these two points. (b) In the quantum jump example the state is reset, at every jump, to the south pole and covers a deterministic pathψt until it jumps again. The empirical measure can thus be solely supported on such a path. (c) In the quantum state diffusion example, stochastic trajectories are supported on the surface of the Bloch sphere and thus the empirical measure is a function defined over it.
that none of our results are restricted to this case, but it is useful for illustrative purposes because it allows a simple representation of the empirical objects µ, Q.
In this case the most general pure-state density matrix can be represented as where (θ, φ) are the spherical polar coordinates of a point on the Bloch sphere. This means that the empirical measure µ can be interpreted as a probability distribution on this sphere. We briefly describe three types of two-state system, in preparation for the discussion in the rest of the paper. In a classical two-state system the only possibilities for ψ correspond to the poles of the Bloch sphere, which are θ = 0 and θ = π. Trajectories for ψ are restricted to the poles [hence b(ψ) = 0 in (3)], and they make discrete jumps from pole to pole, with randomly distributed times. In this case, the empirical measure µ always consists of two delta functions at the poles, with weights that indicate the time spent there, see sketch in Fig. 2(a). The empirical flux Q is a vector containing two numbers, which are the number of transitions from north to south, and the corresponding number from south to north. The large deviations of µ, Q can be derived from the classical theory for Markov chains at level 2.5 [29,56,30,57,31,32].
As a second example we consider a two-state open quantum system where a light source drives transitions between the states, and there is incoherent radiative decay from state |↑ to state |↓ . This corresponds to (2) with H = Ωσ 1 , with σ 1 being the first Pauli matrix, M = 1, and which is also the system pictorially represented in Fig. 1. In this case we explain below that µ is supported on a single line on the Bloch sphere, which corresponds to a deterministic evolution given by an effective (non-hermitian) Hamiltonian, starting from the south pole, as in Fig. 2(b). The empirical flux Q is a function over the path, parametrised in t byψ t indicated in Fig. 2(b), and provides the rates with which the state of the sytems at the different points in the path has jumped back to the south pole. These jumps correspond to radiative decay events. Finally, our third example is a two-state open quantum system coupled to a homodyne detector. We consider a fully dissipative dynamics with jump operators L j = iσ j , with j = 1, 2, 3, proportional to Pauli matrices. The average dynamics is known as the fully dephasing channel, however we discuss how single diffusion trajectories sustain non-zero average coherences at stationarity. In this case Eq. (3) corresponds to diffusion motion of ψ on the Bloch sphere, and the empirical measure is defined over the whole sphere [see Fig. 2(c)]. In contrast to the (relatively) simple cases considered so far, the empirical flux Q in this case is a more complicated object: it is related to the empirical current for the spherical diffusion. It turns out, however, that for homodyne quantum trajectories it is also necessary to introduce empirical characterizations of the noises. Details will be discussed below.

Structure of the paper
Having set the scene, we outline the structure of what follows. The statistics of quantum jumps are considered in Sec. 3, and those of homodyne currents are discussed in Sec. 4.
These two main Sections are similar in structure: after introductory material in Secs. 3.1 and 3.2 (respectively Secs. 4.1 and 4.2), the level-2.5 LD principles are presented in Secs. 3.3 and Sec. 4.3. Then, Sec. 3.4 discusses the relationships between the level-2.5 LD principle and previous results for quantum jumps at level-1, including the quantum Doob transform of [35,41]. An analogous discussion is given in Sec. 4.4 for homodyne currents. An example system with statistics of homodyne currents is discussed in Sec. 4.5.
Some of the material of Sec. 3 was presented in a shorter form in [33], while that of Sec. 4 is original. Compared to [33], the discussion of Sec. 3 is more comprehensive, particularly in regard to the connections between level-2.5 and level-1, and the similarities and differences between the Doob process of the unravelled system and the quantum Doob transform [35,41]. The parallel presentation of Sec. 3 and Sec. 4 emphasises the general structure of the theory.

Quantum Jump Processes
This section discusses the LD properties of quantum stochastic processes where the quantum state makes discontinuous random jumps. For example, such processes can describe experiments where a system emits photons that are detected by some measurement apparatus. When photons are detected, one infers that the system has made a transition into its ground state. A shorter account of these results was presented in Ref. [33], we also review some material from Ref. [35].
3.1 Pure-state density matrices and their calculus We introduce notation that will be important in the following. Recall that ψ t is the pure-state density matrix of the system at time t. This is a Hermitian n × n matrix. Denote the set of all Hermitian n × n matrices by M, and the set of purestate density matrices by M p (clearly M p ⊂ M). A generic member of M p has matrix elements ψ jk = z * j z k (11) where (z j ) n j=1 is a (state) vector with complex elements and n j=1 z * j z j = 1. (The notation z * indicates the complex conjugate of z.) For stochastic processes evolving mixed-state density matrices, the relevant space of states would be that of positive, unit-trace density matrices, M m ⊂ M.
The theory that we present is independent of the basis in which the pure state ψ is represented. However, it is natural to identify a set of classical basis states {|j } n j=1 so that |j corresponds to a state vector with z j = 1 and z k = 0 for k = j. The corresponding matrix ψ has ψ jj = 1 and all other elements are zero, it may be represented as ψ = |j j|.
Note also that (8) includes a (Dirac) delta function for the matrix ψ. To deal with this we must define integrals over such matrices. It is also useful to define gradients in M. We achieve this by treating each matrix element as a separate variable, see Appendix A. For a scalar function f = f (ψ) [that is, f : M → R] the gradient is a matrix ∇f with elements Also, given a matrix X ∈ M we define The theory required for integration is also outlined in Appendix A. All integrals with respect to ψ (or ψ ) are taken over M (although in most cases the integrand is non-zero only on M p ). The key results are dψ δ(ψ − χ)f (ψ) = f (χ) (14) and an integration-by-parts formula where X = X(ψ) is a matrix-valued function (that is, X : M → M) and its divergence is ∇ · X = jk (∂X jk /∂ψ jk ). Since the integrations cover the whole of M, there are no boundary terms in (15).

Unravelled stochastic dynamics of open quantum systems, and quantum jump trajectories
We now explain how the unravelled quantum dynamics (3) operates in systems with quantum jumps. Recalling Fig. 1, we identify each measurement time-record with a quantum trajectory, which specifies the state of the system at each time t, conditioned on the measurement outcomes obtained. An example of a measurement time-record is the detection plot in Fig. 1(a). The time records -and hence the quantum trajectories -are generated by a stochastic process, described by the Belavkin equation [10] dψ which is the unravelled equation (3), specialised to the jump case. Here, dψ t represents the increment of the pure quantum state ψ t in the infinitesimal time interval [t, t + dt] and with The dn it in (16) are random noise increments whose possible values are 0, 1; they account for the detection events, see Fig. 1(a). Only one event can occur in any infinitesimal time period, which means that dn it dn jt = δ ij dn it ; the average noise increment, conditioned on the state being in ψ t , is We emphasise that Eq. (16) describes the time-evolution of a matrix and must be interpreted as a set of equations for increments of matrix elements (dψ t ) jk .

Comparison of quantum and classical processes
Eq. 16 can describe both classical and quantum jump processes, on an equal footing. The relevant classical processes are Markov jump processes over the n classical basis states. They are specified by transition rates W (x, y) [from classical state x to the classical state y]. Their trajectories are piecewise-constant: the system remains in a classical state for a (random) time interval before making a discrete jump to some other (classical) state. Hence the allowed values of ψ t are the (discrete) classical states |j j| with 1 ≤ j ≤ n.
To describe the trajectories of these models one first sets H = 0 in (16). Then, for every non-zero rate one introduces a jump operator L xy = W (x, y)|y x|. This jump operator generates jumps of ψ t from |x x| to |y y|, with rate W (x, y).
[The indices i in (16) are replaced by indices xy, which label the types of jump.] With these conditions, by starting from a classical configuration state one has a classical state at every later time and B[ψ] = 0.
Quantum jump processes differ in several important respects from classical processes. First, ψ t can include quantum superpositions as well as classical states: this means that ψ t can take any value from the set M p . Second, trajectories for ψ t are piecewise-continuous instead of piecewise constant. In fact, the trajectories are piecewise-deterministic: ψ t evolves between jumps as (∂ψ t /∂t) = B[ψ t ] which may be solved as The jumps are discrete, as in the classical case. If a jump occurs at time t via the ith jump operator then the density matrix jumps as .
This means in particular that while classical jumps occur between discrete configurations, quantum jumps can occur between generic quantum superpositions. Given a system in state ψ, the jump rate into ψ (by channel i) is .
The δ function indicates that the final point of a jump is fully determined by the initial point and the channel. The fact that the quantum state evolves continuously between jumps also has consequences for the statistics of the times at which the jumps take place. In particular, the probability density function of times between jumps is exponentially distributed in classical jump processes but has a more general structure in quantum systems.

Unravelled quantum master equation
As discussed in Ref. [33], it is useful to derive a dynamical generator that describes the evolution of the quantum state given in (16). (The relevant theory is that of piecewise-deterministic Markov processes [1].) The generator for this process is a linear functional: (If f is a matrix-valued function then W acts separately on each matrix element.) The generator has the property We note from (17) that Hence, taking f (ψ) = ψ in (22) we find where L is given by (2). This L is a linear operator. Hence by (23), the time evolution of ρ(t) = E[ψ t ] is given by (1). To avoid any confusion associated with the notation in (25), we discuss briefly the object W[ψ]. An alternative notation in (22) would be to write Wf for the function obtained by operating with W on f , so the left hand side of (22) would be Wf (ψ). In this case one can define the identity function e by e(ψ) = ψ and the left hand side of (25) would be We(ψ). Throughout this work, that object is denoted by W[ψ].
Physically, we have shown that averaging the pure state ψ t over the trajectories of the unravelled dynamics generates the (mixed) density matrix of the open quantum system of interest. It is a non-trivial feature of these unravelled processes that the expectation value of ψ obeys a closed equation of motion. (The situation is similar to classical Ornstein-Uhlenbeck processes.) The process (16) also has a master equation, which is an equation of motion for the probability density for ψ t , which is denoted by P t (ψ). For a generic function f , Since this equation holds for all f , one obtains from (22) that which is the unravelled quantum master equation [33]. We define an adjoint opera- Note that (1) is known as the quantum master equation (QME), but the unravelled quantum master equation (27) is a completely different object. In particular, the unravelled QME describes the time-evolution of a probability density function, similar to standard master equations in the theory of stochastic processes. The QME describes the time-evolution of a density matrix, and has a different structure from standard master equations.

Steady state
We assume throughout that the Hamiltonian and jump operators in (16) are such that the process converges for long times to a unique steady state. This means in particular that for any initial condition P 0 , the solution of (27) tends to a unique long-time limit which we denote by P ∞ (see Ref. [67] for conditions on the uniqueness of this invariant measure for quantum Markov chains). The linear operator W has eigenvalues which are non-positive, with at least one zero. Since the state space M p is compact, the uniqueness of the steady state means that the zero-eigenvector of W is unique and that all other eigenvalues have (strictly) negative real parts. That is, W has a positive spectral gap.
The interpretation of P ∞ is the probability density for ψ t , in the steady state. We also define the joint probability density Γ for the initial and final points of quantum jumps, in the steady state. This is Also let Γ be a vector whose elements are the Γ i (for 1 ≤ i ≤ M ).

LD principle at level 2.5
We now formulate the level 2.5 LD principle for these systems, similar to (9). The empirical measure µ τ (ψ) was defined in (8). It follows from (8,14) that the trajectory-dependent quantity is the empirical time-average of f . We now define the quantity that plays the role of Q in (9). This is a vector of empirical jump rates, denoted by k τ . For a given trajectory, the empirical jump rate for channel i depends on the initial and final points of every jump in the trajectory; it is defined by where the sum is over all the quantum jumps of type (channel) i that occur in the trajectory; the jth jump is from ψ − j to ψ + j . Similarly to (29), integrals involving k i τ generate weighted sums over the jumps: for any function g(ψ, ψ ) then

Statement of LD principle
Since the system has a unique steady state and W has a positive spectral gap, it follows that weighted sums of the form (31) converge for large times to fixed (deterministic) values, as do time averages of the form (29). This can be summarised as follows: for τ → ∞ then with probability one (see also [30]). The LD theory describes rare events where this convergence fails. We state the relevant LD principle before sketching its derivation. The LD principle states that as τ → ∞ then the joint distribution of (µ, k) behaves as [This notation has the same meaning as (9), the left hand side is to be interpreted as the probability distribution for µ τ , k τ .] From (32) one must have I qu 2.5 [P ∞ , Γ ] = 0. Fixing (µ τ , k τ ) specifies the values of all quantities of the form (29,31). This means that the level 2.5 LD principle encodes the (joint) large deviation statistics of all such quantities. The function I qu 2.5 is finite only if the current and flux obey a continuity condition Assuming that this condition holds (and that µ is a properly-normalised empirical measure) one has where we have introduced the function Equations (33)(34)(35)(36) fully specify the level 2.5 LD principle for quantum jump trajectories. If the continuity equation (34) does not hold then we set formally I qu 2.5 [µ, k] = +∞, this means that (−1/τ ) log Prob[µ τ , k τ ] diverges as τ → ∞.

Derivation of LD principle
All LD principles in this work are derived by the same general method, based on the Gärtner-Ellis theorem [16,21]. We first define a moment-generating function (or functional) for the quantity of interest. In this case we consider the empirical measure and flux so we define a generating functional: Then by the Gärtner-Ellis theorem one has (modulo some technical assumptions that are always satisfied in the following): (39) Moreover, we show in Appendix B.1 that Θ[u 1 , u 2 ] may be characterised [33] as the largest eigenvalue of a tilted generator which is a deformed version of W in (22): For many large deviation problems, finding the largest eigenvalue of the tilted generator is prohibitively difficult. However, a key feature of level 2.5 is that the maximisation in (39) can be solved in closed form, yielding (34,35). This computation is described in Appendix B.2, it proceeds similarly to that of [30].

Comparison with level 2.5 for classical systems
It is useful to compare the LD principle (33) with corresponding results for classical Markov chains [29,30], For classical systems as described in Sec. 3.2.1, the empirical jump rate (by channel xy) is simply where Q τ (x, y) is the (classical) empirical jump rate: the number of jumps from the classical state x to the classical state y, normalised by τ . The corresponding jump rate (21) is Also, the empirical measure µ is non-zero only for classical configurations: where µ cl is the classical empirical measure, normalised as x µ cl (x) = 1. Substituting these facts into I qu 2.5 (µ, k) gives which indeed coincides with the classical level 2.5 functional [29,30]. (The sum runs over pairs of states for which W (x, y) = 0.) To summarise: in the quantum formalism described here, classical jump processes correspond to piecewise constant trajectories for ψ t , which takes values from a discrete set. In such cases (35) becomes the classical LD principle at level 2.5. The quantum case is more general because ψ t follows piecewise-continuous trajectories and can take any value in M p .

Auxiliary process (Doob transform, optimally-controlled process)
In LD theory, the rate function specifies the probability of rare events. It is also important to characterise the mechanism of these events -that is, the behaviour of trajectories with non-typical values of (µ τ , k τ ). The general LD theory explains that these (rare) trajectories can be characterised as typical trajectories of a different system, which we call here the auxiliary process. This Section characterises the auxiliary process associated with the LD result (33).
The derivation is related to a Doob transform and to optimal-control theory, see for example [24,25]. Note however: the auxiliary process that we describe here is associated to trajectories of the unravelled system, described by a Belavkin equation similar to (16). This is different from the quantum Doob process discussed in [35,41]. We return to this distinction in Sec. 3

.4.3 below.
There is a general recipe for identifying auxiliary processes, using the tilted generator [23]. For any such generator, we define the dominant eigenfunction as the eigenfunction corresponding to the largest eigenvalue. We focus on the tilted generator W u , and let f R = f R (ψ) be its dominant eigenfunction. Then the generator of the auxiliary process operates on functions f as For W A u to be a generator of a stochastic process, we require that its largest eigenvalue is zero and that the constant function f (ψ) = 1 is the associated eigenvector: W A u [1] = 0. This is easily verified for (44). Indeed, this equation allows the auxiliary process to be constructed, dependent on u 1 , u 2 and the associated eigenfunction f R . The generator of the auxiliary process has the same form as (22), but with the rates w i replaced by auxiliary rates w A (ψ, ψ ). To find the values of these rates associated to any given (µ, k) requires determination of the u 1 , u 2 that achieve the maximum in (39). This computation can be performed, formulae for w A are given in (147) of Appendix B.2. However, the final outcome of the computation can be obtained by direct physical reasoning, as we now explain.
By definition of the auxiliary process, the empirical jump rates k and the empirical measure µ are typical of its steady state. This means in particular that the mean jump rate from ψ to ψ must be This result fully specifies the auxiliary process for large deviations at level 2.5. It also gives a physical interpretation of the continuity constraint (34): the UQME for the auxiliary process is obtained by replacing w by w A in (27). Then (34) says that P t = µ must be a steady state of that equation, consistent with µ being the steady state of the auxiliary process. It is also notable that This measures the difference between the auxiliary rates and the original rates of the model. It states that the magnitude of the rate function is determined by the amount by which the rates w must be modified, in order to arrive at a model with the relevant (µ, k).

Full counting statistics of quantum jumps (LDs at level-1)
Since the level 2.5 LD principle encodes the probability for large fluctuations of all time-averaged quantities, it can be used to recover the statistics of total quantum jump rates, which are called full counting statistics. We show this explicitly, to indicate how the level 2.5 analysis can be applied. The total (empirical) jump rate for channel i is obtained by integrating the empirical rate k i over all initial and final statesk This jump rate obeys a level-1 LD principle, which has been derived in previous work [35,41] using methods based on tilted Lindblad operators. This Section shows that the same result can be obtained by contraction from the level-2.5 LD principle, it also explores the relationships between the tilted Lindblad approach and the level-2.5 method described in this work. Specifically, we review the tilted Lindblad method in Sec. 3.4.1, after which Sec. 3.4.2 shows that the same result can be derived from the level 2.5 LD principle. The relationships between the methods are discussed in Sec. 3.4.3, with a focus on the auxiliary process and the quantum Doob process.

Tilted operator approach
From (30), the integral (47) is the total number of jumps occurring by channel i in the whole trajectory, normalised by τ . Also letk = (k 1 ,k 2 , . . . ,k M ). For long observation times τ , the probability distribution of this observable obeys a LD principle Prob(k) exp −τ I 1 (k) .
To show this, we follow again the general recipe of Sec. 3.3.2. The SCGF is where λ = (λ 1 , λ 2 , . . . , λ M ) is a vector of parameters conjugate tok. The SCGF may be characterised [35] as the largest eigenvalue of a linear operator acting on matrices X ∈ M: For λ = 0 one recovers L † , which is the adjoint of the operator L defined in (2). (This adjoint is defined by the property that Tr[XL(ρ)] = Tr[ρL † (X)] for all Hermitian matrices X, ρ.) Then I 1 (k) can be obtained by Legendre transform [21,68,23,24] (In contrast to level 2.5, neither the SCGF θ k nor the rate function I 1 can be obtained in closed form.)

Level 1 full-counting statistics from the unravelled dynamics
We now give a different analysis of full-counting statistics, using the unravelled quantum dynamics (16). The idea is to characterise the SCGF θ k as the largest eigenvalue of a (tilted) generator for the unravelled system, similar to (40). Note that the SCGF θ k in (48) (40), it follows that θ k can be characterised as the largest eigenvalue of the tilted generator We now show explicitly that solving this eigenproblem for θ k is equivalent to finding the largest eigenvalue of (49). To this end, we first show that if θ is (any) eigenvalue of L † λ then it is also an eigenvalue of W λ . In this case we have . Also, it is easily shown [by analogy with (25) so that where the second equality uses that is an eigenmatrix of L † λ , and the definition of f . Hence this f is an eigenfunction for W λ with eigenvalue θ. However the converse does not hold: there may be eigenvalues of W λ that are not eigenvalues of L † λ . It therefore remains to show that the largest eigenvalue of W λ coincides with the largest eigenvalue of L † λ . For a general linear operator, we refer to the eigenfunction corresponding to the largest eigenvalue as the dominant eigenfunction.
From our assumption that (16) has a unique steady state, it follows that the dominant eigenfunction of W λ is always positive, f (ψ) > 0, and that this property is unique to the dominant eigenfunction. Moreover, the theory of Lindblad operators [41] shows that the dominant eigenmatrix of L λ has positive eigenvalues. Since ψ is a pure state (ψ = |z z|) then this implies f (ψ) = Tr( ψ) = z| |z > 0. So f (ψ) is an eigenfunction of W λ that is always positive -it must be the dominant eigenfunction. Hence the largest eigenvalues of L λ and W λ are both equal to θ k . The level-1 rate function can then be obtained from (50).
Finally, we observe one more way of characterising I 1 . By the contraction principle for LDs [16,21], one has where the infimum is taken over (µ, k), subject to (47). Admissible choices for µ, k in (54) also require that µ is normalised and that the continuity condition (34) holds. This minimisation was performed in [33], which verified that it is equivalent to (50). However, the approach here based on the tilted generator W λ is a more direct route to the same answer.

Auxiliary process and quantum Doob process
We now turn to the auxiliary process for full-counting statistics, which illustrates the physical connection of the unravelled dynamics to the quantum Doob process of [35,41], and hence to the tilted Lindblad operator. (The connections are summarized in Fig. 3, below.) In contrast to the level 2.5 LD principle where explicit results were available, LD results at level-1 rely on the solution to the eigenproblems discussed above. However, the auxiliary rates w A are available from (147) [in Appendix B.2], in terms of the dominant eigenfunction of W λ : they are The auxiliary process with these rates reproduces the rare (large deviation) trajectories of the unravelled process, as in Sec. 3.3.4. Similar to (44), the generator of this auxiliary process is In [35,41], a different kind of auxiliary process was identified, which we call here the quantum Doob process. It corresponds to a Lindblad equation of the form (1), where the Hamiltonian and the jump operators are both modified from the original model of interest. Specifically, the Lindblad generator of this model is given by [35,41] Fig. 3: Illustration of the relationships between the tilted generators L λ and W λ ; the quantum Doob process (described by Lindblad generator L D λ ); and the unravelled auxiliary processesψt and Ψt (described by classical generators W A λ and W D λ ). It is notable that averaging the auxiliary processψt does not yield a valid Lindblad evolution for E[ψt]. However, the transformation (59) yields an unravelled process Ψt that is related to the quantum Doob by E[Ψt] = ρ D , see Appendix C.
Using this L D λ in the Lindblad evolution (1) defines an open quantum system in which the HamiltonianH and the jump operatorsL depend on λ as [35,41] where h.c. denotes the Hermitian conjugate. (We recall that depends on λ.) This new system is the quantum Doob process. It is significant because typical timerecords of quantum jumps in the quantum Doob process match exactly the rare time-records that appear as large deviations in the original system. In this sense, the quantum Doob process plays the same role as the auxiliary process for the unravelled dynamics. The unravelled dynamics for the quantum Doob process may also be constructed. Let the pure-state density matrix of the auxiliary process of (56) beψ t and define a new pure-state density matrix: The trajectories of this Ψ t define an unravelled jump process which was shown in [33] to coincide with the unravelled dynamics of the quantum Doob process. This is verified in Appendix C. In other words, the unravelled dynamics of the quantum Doob system can be obtained by deforming the auxiliary process derived here, according to (59). The generator for this unravelled dynamics is denoted by W D λ , it can be constructed by analogy with (22), with the transformed Hamiltonian and jump operators from (58) used in place of the original H, L.
The relationships between the quantum Doob process and the various unravelled process are illustrated in Fig. 3. It is notable that the auxiliary process described by (55,56) cannot generically be interpreted as the unravelled dynamics of a system obeying Lindblad dynamics (1,2). (The Lindblad form places constraints on the unravelled dynamics which are not satisfied by generic auxiliary processes.) The transformation (59) is essential for relating the unravelled auxiliary processes to the quantum Doob transform.
An application of the level-2.5 LD principle for quantum jumps was considered in [33], which derived a thermodynamic uncertainty relation for photon counts, in the restricted setting of quantum reset processes. An expanded version of that derivation is given in Appendix D.

Other LDs at level-1
So far we have considered level-1 LDs ofk, which are full-counting statistics. These can be investigated either using the unravelled dynamics (via W λ ) or by a tilted Lindblad operator L λ . However, working with the unravelled dynamics allows other LD principles at level-1, which cannot be obtained by tilted Lindblad methods.
To see this, consider the fluctuations of a function O = O(ψ t ). Its time-average is In typical cases of interest, the function O(ψ) might be the quantum expectation of an operator X, e.g. O(ψ) = Tr (Xψ), which is a linear function of ψ. Non-linear functions can also be considered: for example, large deviations of the entanglement entropy of a bipartite quantum system were considered in [33]. The probability density for o τ obeys a LD principle with where φ(o) is the LD rate function. Similar to Sec. 3.4.2, this function may be obtained by contraction from level 2.5. Alternatively the SCGF for o τ is Θ[u 1 , u 2 ] from (38) with u 1 (ψ) = λO(ψ) and u 2 = 0. Hence this SCGF can be obtained as the largest eigenvalue of the appropriate operator W u and the rate function can be obtained by Legendre transform (with λ as the conjugate field). Furthermore, it is also possible to estimate this SCGF by using population dynamics methods [17,69] applied to the unravelled master equation (27) [70]. We are not aware of any general characterisation of such SCGFs in terms of tilted Lindblad operators.

Quantum Diffusion Processes
Many stochastic processes in classical physics are described by differential equations involving Wiener noises (or Langevin equations, or Brownian motions). In the large deviation context, these processes also obey LD principles at level 2.5. The ideas are similar to jump processes, but the technical details are different. In particular the empirical current plays the role of the flux Q in jump processes.
In the quantum context, homodyne measurements on open quantum systems result in random output signals that are related to Brownian motions, recall Fig. 1. (This is in contrast the photon-detection experiments which are related to jump processes.) We emphasize that the presentation of this Section is analogous to Sec. 3, with the addition of an example system that is analysed in Sec. 4.5. The LD principle at level-2.5 is presented in Sec. 4.3 and the connection to level 1 is discussed in Sec. 4.4, including the relation between quantum Doob process and unravelled auxiliary process. To set up those results, we briefly review level 2.5 functionals for classical diffusion processes [30,24], and we explain how these are generalised to the quantum case of homodyne detection experiments [2].
4.1 Summary of LDs at level 2.5 for classical diffusion processes As a generic classical diffusion process we take x ∈ R d evolving by a stochastic differential equation with n α independent noises: where A : R d → R d is a drift term, and B α ∈ R d is a vector indicating the strength and direction of noise α (assumed independent of x t ). The W α are independent Wiener processes with unit variance. We define a diffusion matrix with elements Here and elsewhere, sums over α are assumed to run from 1 to n α . The matrix D cl is assumed to be invertible [30] which requires (as a necessary condition) that n α ≥ d. We assume that this model has a unique steady state. In this section, the natural geometry is that of Euclidean space R d : gradients such as ∇f and dot products such as A · ∇f are taken in this space. [In later sections we revert to gradients in M, as defined in (12).] The generator for (62) is W diff which acts on functions f : Sums over i, j are taken over 1, 2, . . . , d. Similar to (26), the generator can be used to derive the Fokker-Planck equation for the time-evolution of the probability density for x t :Ṗ where J cl is the probability current which depends linearly on P t . Its elements are In the steady state one has P t = P ∞ and the associated probability current J cl,∞ = J cl (P ∞ ) is divergence-free: ∇ · J cl,∞ = 0.

Large deviations
To analyse large deviations at level 2.5, we define the empirical measure using (8), as above. We also define an empirical current as where the • symbol indicates that the integral uses the Stratonovich convention. For a given trajectory, J e τ (x) measures the displacement of the system at point x, summed over its visits to that point, and divided by the total time.
For large times we have a result analogous to (32), which is with probability one. The corresponding LD principle is which describes the joint statistics of empirical measure and empirical current. The associated rate function is finite only if ∇ · J = 0, in which case it takes the value as discussed in [30,24].

Empirical noise
Note the presence of the inverse of D cl in (69), so this matrix should be invertible to apply the theory as presented here. For quantum diffusions, the analogue of this matrix may not be invertible. This motivates us to modify the standard theory at level 2.5, as follows. We define an empirical noise which is the average noise increment for particles at x. (Note, this integral is taken in the Ito sense, so the empirical noise has mean zero.) Writing j for the vector of empirical noises, it can be shown that the distribution of (µ t , j t ) obeys an LDP [similar to (33)] (We only state this result here, we give the corresponding derivation for the quantum case below. That derivation is easily adapted to this case.) The rate function I cl noise is finite only if a suitable continuity condition ∇ · J = 0 holds; this can also be written as The object inside square brackets is the empirical current J for a system with empirical measure µ and noise j. It is the sum of J cl (µ) and a term coming from the empirical noise. In cases where the continuity condition (72) holds, the rate function is simply The level 2.5 rate function (69) can be obtained from this LD principle by contraction: one minimises I cl noise over all empirical noises j α that are consistent with a given empirical current. This minimisation yields the inverse of D cl in cases where it exists. (We note in passing that these discussions are not mathematically rigorous, in particular we have not stated precise technical conditions required on (62) in order to obtain this LD principle, although we do insist that the system should have a unique steady state [67]. See also the discussion of the quantum case, below.) Physically, the meaning of this contraction is that large deviations occur via the least unlikely noise realisations, and j τ characterises these noises. In cases where D cl does not have an inverse, there are some empirical currents that cannot be realised by any realisation of the noise. In this case the quantity J − J cl (µ) in (69) is outside the image of D cl and the rate function I cl 2.5 is formally infinite. Note finally, there is a thermodynamic uncertainty principle for currents in these systems [71,72], it is straightforwardly derived by setting µ = P ∞ and J = λJ cl (P ∞ ) in (69), which manifestly solves (72). This construction is valid only if D cl has the property that J cl = D cl F cl may be solved for F cl (for all x where P ∞ > 0). The simplest case is when D −1 cl exists but it is sufficient in general that the steady-state current J cl (P ∞ ) can be represented as α B α j α , so that there exist realisations of the empirical noise that generate a uniform acceleration of the steady-state current.

Homodyne detection experiments: unravelled dynamics
In the case of homodyne detection experiments, quantum systems are monitored by a continuous observation of the quadratures of the bath quantum operators. This type of measurement process allows a detailed characterization of the emissions of the system into the environment [37]. Indeed, since quadrature operators are proportional to the intensity of the light field, the measured homodyne current can provide information not only about the overall number of emitted photons, but also about the nature of the light, i.e. whether this is in a thermal, coherent or more complex state. In ideal conditions, the outcome of a homodyne experiment consists of a record of the time-integrated value of the measured current as a function of time, as shown in Fig. 1(b). These time-records are stochastic and depend both on the quantum state and on the specific realization of the noisy interaction between system and environment. In what follows we present a comprehensive discussion of the large deviations in these systems starting from the level 1 statistics for homodyne currents and then deriving the very general level 2.5 functional encoding the statistics of generic observable of the process.
Throughout this section, we consider a system described by the Lindblad evolution (1,2), as for jump processes. However, we slightly change our notation in that jump operators are labelled by m (with 1 ≤ m ≤ M ) instead of by i.

Stochastic Schrödinger equation
To describe homodyne trajectories, we consider a stochastic Schrödinger equation as in (3). This takes the form of an (Ito) stochastic differential equation similar to (62): where L is the Lindblad operator from (2) and where α m is a phase factor (see below) and the L m are the jump operators appearing in (2). In contrast to (62), the noise strengths K m depend on the state ψ. This means that we must take care to use Ito's formula when evaluating increments of ψ-dependent functions. In the literature on quantum diffusions [2], this is implemented by Ito rules The phases α m in (75) specify the particular quadrature operator of the environment modes that the experiment is monitoring [37], for each homodyne current. The Lindblad evolution (1) is independent of these phases but the unravelled trajectories can depend qualitatively on the α m . Equation (74) is a stochastic differential equation which describes every possible time-record of a homodyne experiment in which the state is being continuously monitored. In particular, a typical outcome consists of the values of the time-integrated homodyne currents Q m . These are random (trajectory-dependent) quantities, given by Let Q τ be a vector whose elements are the Q m τ . Comparing (74) with (62) one sees that L describes the drift of the diffusion process while K describes the noises. From the Ito rules (76) one sees immediately that E(dψ t ) = E(L(ψ t ))dt; using that L is a linear operator yields E(dψ t ) = L(E(ψ t ))dt. Recalling that E(ψ t ) = ρ t is the density matrix, one recovers (1). That is, the fact that the drift term is linear in the Ito equation (74)  Unless otherwise stated, we assume in the following that the unravelled process has a unique steady state in which the probability density for ψ t is P ∞ (ψ), as in the case of quantum jump processes.

Unravelled quantum master equation
The next step is to identify the generator for the stochastic process (74). We compute this at the level of the quantum state ψ. Consider a function f = f (ψ): its increment df in the short time interval [t, t+dt] is obtained by Taylor-expanding to second order: (It is implicit throughout this section that sums run over all allowed values of the relevant index.) Taking the expectation and using (76) yields where is the analogue of the classical diffusion matrix D cl in this setting (up to a factor of 2). Following that analogy, one sees that that if the number of terms in the sum (M ) is not large enough, the matrix D will be degenerate, and the inverse D −1 will not exist. Indeed, this situation is likely to be common for systems under homodyne measurement. Using (79,23) and recalling (13) we identify the generator for functions of ψ as which is analogous to the classical result (64). Taking f (ψ) = ψ recovers again that ρ = E(ψ) evolves as in (1). The analogue of (65) is the unravelled quantum master equation for diffusion processes:Ṗ where P t is the probability density for ψ. The corresponding probability current is a matrix-valued function of P , its elements are

LD at level 2.5 for quantum diffusions
We now derive a LD principle at level 2.5, following a similar method to Sec. 3.3. The empirical measure is given as usual by (8). Analogous to the classical case from Sec. 4.1, we define the empirical noises Note that The empirical current is Note that (86) includes a Stratonovich product, in contrast to the Ito products used elsewhere. Taking care with this fact we show in Appendix E.3 that the empirical current is fully determined by the empirical measure and empirical noise, as

Large deviation principle and auxiliary dynamics
We derive a LD principle for (µ τ , j τ ) noting that large deviations of (µ τ , Q τ , J e τ ) can then be obtained by contraction. To achieve this, we follow the same steps as Secs. 3.3.2 and 3.3.4. We give a short presentation of the computation, referring to those earlier sections for context and discussion.
Define a moment generating functional for (µ, j) that takes as arguments a 1 : M p → R and a 2 : The corresponding SCGF is The resulting LD principle is with I qu 2.5 (µ, j) = sup Recall, this last formula should be obtained by applying the Gärtner-Ellis theorem to (89). For a rigorous treatment, this would require technical conditions on Θ, which we do not explore here. From a physical perspective, we expect Θ to be wellbehaved as long as the unravelled system explores its (unique) steady state within some finite mixing time [67]. We assume that this is the case and the Gärtner-Ellis theorem can be applied -such a requirement is not trivial in systems where D is non-invertible, but we do not expect this to be too restrictive a condition in practice. This supremum can be computed exactly: we state the (simple) result before outlining the derivation. The rate function is finite only if a continuity equation holds: the empirical current J e during large deviation events must converge, for large times, to a current J which must be divergence-free, ∇ · J = 0, because the relevant trajectories are stationary. From (87), this requires (For compactness of notation, we omit functional dependence on ψ where this leaves no ambiguity.) In cases where (92) holds then Just as in the classical case (Section 4.1), a thermodynamic uncertainty relation can be derived in this system, if there exist choices of empirical noise such that m K m j m in (92) is proportional to the steady state current J(P ∞ ). One simply substitutes these noises in (93) with µ = P ∞ so (92) is easily satisfied. In cases where this construction is not possible, we are not aware of any thermodynamic uncertainty relation.
To derive (92,93), we show in Appendix E.1 that Θ[a 1 , a 2 ] from (89) is the largest eigenvalue of the tilted operator (94) The derivation of (93) from this operator is given in Appendix E.2, it is similar to that of Appendix B.2 for the jump case. Similar to Sec. 3.3.4, the auxiliary dynamics is explicit for level 2.5. It may be derived by identifying its generator as where f R is the dominant eigenvector of W a . This is similar to (44). Physically, the meaning of the auxiliary process is that the noise dW m develops a (ψ-dependent) mean value equal to (j m /µ). Hence (74) is modified in the auxiliary dynamics as Similarly, from (77) one sees that the homodyne current in this auxiliary model evolves as We recognise (92) as the condition that this auxiliary dynamics has steady-state distribution µ, similar to the discussion of Sec. 3.3.4 for jump processes.

LDs of homodyne currents (level 1)
Building on this analysis of level 2.5 LDs, we now consider LDs of homodyne currents. As in the case of jump processes, this will allow us to recover (and extend) earlier work that was based on tilted Lindblad operators [37]. We define the time-averaged homodyne current as Similar to Sec. 3.4, the level-1 LD principle for this quantity can be computed from the unravelled LD principle at level-2.5, or by using tilted Lindblad operators. The relation between the corresponding auxiliary processes and quantum Doob processes are also analogous to Sec. 3.4. Given that the physical picture is the same as that Section, the presentation here is brief.

Tilted operators
We analyse large deviations of q τ . Its SCGF is The operator L s is a multivariate generalisation of the tilted operator derived in [37]. (The definition of s used here differs from theirs by a factor of 2.) Analysis of this operator shows that the dominant eigenmatrix has positive eigenvalues, which means that the largest eigenvalue of W s is also the largest eigenvalue of L s , and therefore coincides with θ q (s). The operator L s was used in [37] to analyse large deviations of homodyne currents at level-1. Here we have shown how these large deviations can be analysed directly from the unravelled trajectories.

Auxiliary process and quantum Doob process for homodyne detection
Similar to the discussion of full-counting statistics in Sec. 3.4.3, one may construct an auxiliary model that reproduces the quantum trajectories associated with large deviation events. One may also construct a quantum Doob-transformed process similar to those in [35,41].
For the auxiliary process in the unravelled representation, one has from (194) (in Appendix E.2) that the empirical noise associated to the rare event is (We used that a m 2 = −s m and f R = Tr[ ψ], note that if s = 0 then is the identity and j m = 0, as required.) From (96) the auxiliary process for ψ is then For the quantum Doob process one follows instead the procedure given in [35,41]. The resulting Lindblad operator is This corresponds to a physical model whose typical trajectories allow reconstruction of the homodyne measurement records associated with large deviations event of the original model, analogous to the case of full-counting statistics. Following again the argument of Appendix C [using (101)] shows that the unravelled dynamics of the quantum Doob process can be related to that of the auxiliary process (104) by (59), just as in the case of full counting statistics. (Recall also Fig. 3.)

Example
We illustrate the application of the level 2.5 formalism with an example from a twolevel quantum system, corresponding to a quantum spin-1/2 particle. The space of states in this case admits a pictorial representation in terms of the Bloch sphere. In particular, pure states are parametrised by the spherical polar coordinates (θ, φ) as in (10). We show that the unravelled system corresponds to diffusion on this sphere, and we analyse large deviations in this case. The large deviations of the homodyne currents are quite trivial in this case, so we discuss instead large deviations of the (time-integrated) coherence of the unravelled quantum state.
We consider the dissipative Lindblad dynamicṡ where σ m is the m-th Pauli matrix. The unravelled trajectories are generated by (74), with α m = π/2 in Eq. (75). Hence Note that the stochastic term resembles unitary evolution with a random timedependent Hamiltonian given (formally) by m σ m (dW m /dt).

Diffusion on the Bloch sphere
For two-state systems, it is natural to write the density matrix in spherical coordinates, as in (10). The equation of motion can then be represented as This representation emphasises that the set of matrices M p can be parameterised by two real angles. Hence, instead of considering a probability density P (ψ) as in previous analysis, one may consider a simple probability density for θ, φ. This obeys a Fokker-Planck equation alternatively, noting that the uniform distribution on the sphere corresponds to P (θ, φ) = sin θ/(4π) one may define a covariant probability density p(θ, φ) = P (θ, φ)/ sin θ which evolves as We recognise the right hand side as the Laplacian in spherical co-ordinates. The meaning of this equation is that ψ undergoes isotropic diffusion on the Bloch sphere, and the steady-state distribution is uniform on the sphere. It follows that the steady-state of the system is time-reversal symmetric, and the probability current in this state is zero. We work primarily with P (θ, φ), the probability density for (θ, φ). In this representation, the probability current J is a tangent vector to the Bloch sphere, it has components along the azimuthal (φ) and polar (θ) directions. That is and from (110) one has J θ = 2[cot θP + (∂P/∂θ)] and J φ = −2 csc 2 θ(∂P/∂φ). From this point, the classical large deviation theory of Sec. 4.1 can be applied, as we see below.
We note however that this geometrical representation based on the Bloch sphere is limited to two-state quantum systems (n = 2). The general theory that we have presented is based on probability densities for matrix elements of ψ, which we have denoted by P (ψ). In this case the probability current is matrixvalued quantity, as in (83). Since such currents may not be intuitive, we give a brief physical discussion of how they appear in this case.

Currents in M
The key point from (108) is that while ψ has 4 elements (two real and two complex), a general increment of ψ can be described by a two-component vector (dθ, dφ), because ψ always remains in M p . The current is similarly described by the two components (J θ , J φ ).
There is a useful geometrical structure here: to illustrate it in a simple way, we consider a smooth curve on the Bloch sphere that is described by ψ(u) with 0 ≤ u ≤ 1. The restriction to smooth paths (and not Brownian motions) avoids complications from Ito's formula. The curve can be specified in terms of two functions (θ(u), φ(u)). Then where primes indicate derivatives, and E θ (ψ) = − sin θ 1 2 e −iφ cos θ 1 2 e iφ cos θ sin θ , E φ (ψ) = 0 − i 2 e −iφ sin θ i 2 e iφ sin θ 0 (114) are ψ-dependent matrices which form a basis for the tangent space to the Bloch sphere. It follows that the probability current at the point ψ is where E θ , E φ are the matrices from (113) and J θ , J φ are the scalar fields defined on the sphere as in (112). The empirical current also has a similar form. This general picture still holds true for systems with n > 2 states: the probability current is a vector field that is everywhere tangent to M p and can be written as J = α E α (ψ)J α (ψ) where the E α are matrices that form a basis for the tangent space and the J α are real-valued fields. Explicit construction of the basis matrices E and the currents J is not simple for large n (the tangent basis has 2(n − 1) independent components). This motivates our general formulation in terms of matrix elements of ψ, which is always applicable.

Large deviation analysis
Returning to the example (107), it is easily verified from (77) that dQ m = dW m , that is, the three homodyne currents are simple random walks. Hence their rate functions are simple quadratic functions. This is a general feature of systems where the operators e iα m L m are anti-Hermitian. We note that steady state of the Lindblad evolution in our example is ρ ∝ 1, the identity matrix, so there are no coherences in the steady state, at this level.
However, while the homodyne currents have Gaussian fluctuations, large deviations of the empirical measure µ can have more complex behaviour, and so can coherences within the unravelled system. We consider the coherence of the unravelled density matrix This object does have non-trivial fluctuations in the unravelled dynamics. We define its time averagec For large observation times τ , one has that The coherence also obeys an LD principle Prob(c τ ) e −τ I c (c τ ) with (by a contraction argument) where the infimum is subject to dψµ(ψ)C(ψ) =c. The rate function I c can also be obtained as sup s [−sc − Θ c (s)] where Θ c is the largest eigenvalue that solves − 2 ∂ ∂θ (P cot θ) + 2 ∂ 2 P ∂θ 2 + 2 csc 2 θ This problem can be solved numerically (see Fig. 4). We now obtain a simple bound on this rate function via level 2.5. This requires that we construct a suitable µ, j for use in (119). We use (87) to express the empirical current of our example system in terms of its empirical noises, as We require (∂J e θ /∂θ) + (∂J e φ /∂φ) = 0 in order solve the continuity constraint (92). In fact, the time-reversal symmetry of the original problem and the fact that the coherence is time-reversal symmetric means that the relevant large deviations are realised by trajectories with J e = 0.
We express µ as a probability density for (θ, φ), just like P . We consider a oneparameter family of (µ, j), in order to generate a range of values forc. Specifically, This is independent of φ, just like P ∞ (the coherence does not depend on φ so there is no reason why the auxiliary process should perturb its distribution away from that of the steady state). For λ > 0 the distribution µ λ is biased towards the poles of the Bloch sphere (less coherence) and for λ < 0 it biases towards the equator (more coherence). Then suitable empirical noises that achieve J e = 0 are with f (θ) = ∂µ ∂θ − µ cot θ. Physically, this empirical noise counteracts the tendency of the system to relax towards the steady state P ∞ , leading to trajectories with non-typical values of the coherence. Using these (µ λ , j), the value of I 2.5 can then be computed. To provide a bound on I c one must also compute dψµ(ψ)C(ψ). Performing the integrals numerically, the resulting bound is shown in Fig. 4, together with the (numerically) exact result obtained via (120).
The bound reproduces the general behaviour of I c but is not exact. The reason is that the ansatz (122) is not sufficient to fully capture the empirical measure of the rare trajectories that realise the rare event. However, it does have the right general form, especially for small λ. The fact that the rare trajectories have J e = 0 is also an accurate reflection of the rare event -one way to see this is that writing (120) as an equation for the covariant density p transforms the eigenvalue problem, into the (time-independent) Schrodinger equation for energy levels of a quantum particle diffusing on a sphere, with potential (s sin θ)/2. This is a Hermitian eigenvalue problem: these correspond generically to large deviation problems with time-reversal symmetry [19].

Outlook
We have analysed unravelled stochastic processes that describe the dynamical evolution of open quantum systems. In particular, we have derived LD principles for these systems at level 2.5, which provide an explicit and general characterisation of the joint fluctuations of the system and environment. We have explained how these LD principles are related to LDs at level 1, which can be analysed by tilted operator methods [35,37,42]. We have also discussed the implications of these LD principles for thermodynamic uncertainty relations.
This work opens up a framework for analysing fluctuating behaviour in open quantum systems. It provides a thermodynamic formalism where new interesting nonequilibrium phenomena, such as entanglement phase transitions [43,44,46,47,48,49,50] or scrambling of information in stochastic processes [73,74], can be investigated by means of concepts and tools that proved very powerful in equilibrium statistical mechanics. This makes it possible to look not only at average behaviour but also at the behaviour of higher-order time-correlation functions in quantum stochastic processes.
Since quantum trajectories have a direct relation with measurement outcomes in actual experimental settings involving continuously monitored systems, as for example the photon-counting or homodyne-detection experiments discussed here, this formalism also brings the theoretical investigation of nonequilibrium quantum systems closer to real observations. Given the general connection between large deviation principles and gradient-flow dynamics [75], there are also potential connections of this work to gradient-flow characterisations of Lindblad dynamics [76].

A Integration and differentiation in M
This section justifies the definitions in Sec. 3.1. The main requirement is to define a suitable integral over pure states. This is achieved by considering the space M of all n × n Hermitian matrices so that probability measures (etc) can be defined on this space. (In practice, all probability measures that we consider are supported on the smaller space Mp, but this does not affect the general theory.) In this section ψ indicates a generic member of M (that is, a Hermitian matrix but not necessarily a density matrix). It is specified in terms of n real variables (diagonal elements) and n(n − 1)/2 complex variables (off-diagonal elements). To integrate over M we must therefore integrate the n variables over the real line and the complex variables over the whole complex plane. This suggests that we should define the integration measure dψ as where d(z, z * ) indicates that the complex number z is integrated over the whole complex plane. However, since ψ * ij = ψ ji , we can write this (at least formally) as That is, instead of dealing explicitly with variables and their complex conjugates (as would usually be required in complex integration), we simply treat each element as an independent variable. Using the integration measure (125), it is natural to define (formally) δ(ψ−χ) = ij δ(ψ ij − χ ij ), bearing in mind that δ(ψ ij −χ ij )δ(ψ ji −χ ji ) is to be interpreted as δ(ψ ij −χ ij )δ(ψ * ij −χ * ij ), which satisfies the standard equality Hence (14) follows. A similar situation holds for differentiation. In order to consider real-valued functions with complex arguments one should generically write f = f (z, z * ) and consider derivatives with respect to both z and z * . For Hermitian matrices the natural first-order Taylor expansion (separating explicitly the real and complex variables) would be However, using again that ψ and δψ are both Hermitian, we write this as which we can abbreviate using (13) as Given these definitions, one may verify the integration-by-parts formula (15) by using the standard result B level 2.5 for quantum jumps

B.1 Tilted generator
We show how the SCGF for (µ, k) in quantum jump processes can be connected to the largest eigenvalue of a tilted generator. Given a trajectory and two functions u 1 , u 2 , we consider the time-dependence of functions of the form The increment of this function in a short time interval [τ, τ + dt] is (132) where ψ i + is the jump destination given in (20). The expectation of df is Hence we obtain from (133) that where Wu is the tilted generator (40). Now identify Pu,τ (ψ) = E[δ(ψτ − ψ)Zu(µτ , kτ )] as a reweighted (non-normalised) density for ψ. Setting h(ψτ ) = δ(ψτ − ψ) in (135) yields where W † u is the adjoint of Wu, specifically For large times, the linear differential equation (136) is controlled by the largest eigenvalue of W † u . Note also that Gτ of (37) is given by This provides the expected result: the SCGF Θ[u 1 , u 2 ] in (38) coincides with the largest eigenvalue of W † u (which is also the largest eigenvalue of Wu).

B.2 Large deviation principle
We solve the supremum in (39). Denote the quantity to be maximised by Taking functional derivatives, one sees immediately that Now recall that Θ is the maximal eigenvalue of Wu; denote the corresponding eigenfunction by f R . Then (141) Similarly let f L be the corresponding eigenfunction of W † u so that We normalise these eigenfunctions such that dψ f L (ψ)f R (ψ) = 1. Multiplying (141) by f L (ψ) and integrating with respect to ψ yields (143) (For compactness of notation, we omit the arguments of some functions, in cases where there is no ambiguity.) We now take a functional derivative with respect to u 1 . Note that the eigenfunctions f L , f R depend on (u 1 , u 2 ): the result can be written as Using that f R , f L are eigenfunctions of Wu, W † u respectively, the right hand side is Θ(δ/δu 1 ) dψf L (ψ)f R (ψ) which vanishes by normalisation of the eigenfunctions. Combining with (140) gives Similarly, differentiating (143) with respect to u 2 one obtains [using (140)] This is an important result: it says that the empirical jump rate from ψ to ψ can be expressed as is an auxiliary jump rate, see Sec. 3.3.4. Now multiply (141) by f L (ψ); also multiply (142) by f R (ψ); and subtract the results. We obtain which reduces [using (145,146)] to the continuity condition (34). It follows that finding a supremum in (39) requires that (34) holds. (In other cases the supremum is +∞ so the rate function is infinite.) Finally, combining (143) with (145,146) and (139) one obtains (149) Rearranging (146) we substitute for u 2 . After some manipulations we obtain where D was defined in (36). Using the continuity condition (34) one finds that the terms in the second line cancel each other; hence the maximal value of F is indeed given by (35), as required.

C Quantum Doob transform
We show that the unravelled dynamics of the quantum Doob process (57) coincides with Ψt defined in (59). It is sufficient to show that the expectation value of Ψt follows the Lindblad evolution of the quantum Doob process. Denote this average (under the auxiliary dynamics) by ρ D λ (t) = E[Ψt]; its time-dependence can be obtained from the generator W A λ . We interpret Ψ in (59) as a function, that is Ψ (ψ) = 1/2 ψ 1/2 / Tr( ψ). Then (23) yields and recalling (56) one finds Using that the generator is linear and also (52) gives Hence by (23) and using linearity of L λ Finally, re-expressing ψ in terms of Ψ and taking the expectation: from which we recognise d dt ρ D λ = L D λ (ρ D λ ) as required: the average of Ψ follows the quantum Doob dynamics.
Recalling Fig. 3, the unravelled master equation associated with the quantum Doob dynamics can be obtained in two ways. One option is to unravel the Lindblad dynamics L D λ . The other is to derive the unravelled generator W D λ , by considering the time evolution of Ψ . The remainder of this section uses this latter calculation to confirm that the two options yield consistent results.
By analogy with (23), the generator W D λ obeys This allows W D λ to be derived following a similar approach to (151), which amounts to a change of variables. Define a function g f (ψ) = f (Ψ (ψ)), so the left hand side of (156) can be expressed as E[W A [g f (ψ)]]. The generator (56) of the auxiliary process is The two terms on the right hand side of (157) will be considered separately. Using (13) and the multivariate chain rule, the first term is From (59) Following [33], the next step is to re-express this formula in terms of Ψ , to obtain W D λ . (58)] and which is the analogue of (17) for the unravelled quantum Doob dynamics. Then (159) can be expressed as Now consider the second term on the right hand side of (157): one uses (55), performs the integral over ψ , and re-expresses ψ in terms of Ψ , to obtain with [recalling (58) .
Finally using (161,162) with (157) accomplishes the change of variable from ψ to Ψ : we have with Taking the expectation of (163) and comparing with (156) confirms that this W D λ is the generator for the stochastic process Ψt. Since the jump operatorsL and the HamiltonianH are those of the quantum Doob dynamics, this shows that the same process Ψ can be obtained either by unravelling the quantum Doob dynamics, or by constructing the auxiliary process (45) and performing the change of variable (59). This establishes the status of W D λ as illustrated in Fig. 3.

D Quantum reset processes and thermodynamic uncertainty relation
The level 2.5 formalism can prove very useful when aiming at establishing bounds to large fluctuations of some time-integrated observables. In particular, it can provide thermodynamic uncertainty relations for jump rates [59], and for other types of observables which cannot be addressed by means of tilted operator techniques. Here we review the discussion of Ref. [33], explaining how these ideas work for quantum reset process.
A quantum reset process is a Lindblad quantum process where the destination state of each jump operator L i in (20) is independent of the initial point ψ of the jump. That is where ϕ i is a fixed matrix (the destination state). For such processes, the theory for unravelled processes simplifies considerably. Every quantum jump resets the system to one of the states ϕ i (for i = 1, 2, . . . , M ), we refer to these as reset states. From (19), if the last jump occurred at time t and was of type i then the state of the system at time t + t is Also, let S i (t ) be the probability that this system survives up to time t + t without jumping. This evolves in time as where is the probability to make a jump of type j in the time interval [t + t ; t + t + dt ]. (This is the product of the probability to make no jump in [t, t + t ] and the rate to jump at time t . It is normalised as j ∞ 0 dt p ij (t) = 1.) The functions p ij (t) and ϕ i (t) fully specify the quantum reset process. The survival probabilities S i (t) are related to the p ij by (167). We also define the (marginal) probability that the next jump is of type j, given that the last one was of type i: with j R ij = 1. Also let c i = E[k i τ ] = dψdψ Γ i (ψ, ψ ) be the total rate of jumps of type i (independent of the type of the last jump), recall (28). So With these definitions in hand, we now characterise the steady state of the quantum reset process. Since the evolution between jumps is deterministic as in (166), it follows that the steady state is supported on the deterministic paths that start from each reset state, that is The jump rate c i in the above formula is the rate at which the system jumps via channel i, i.e. jumps into ϕ i . With the above P∞, using the definition of c i , we can compute This is an eigenvalue problem for the matrix R. The normalisation j R ij = 1 and the fact that all R ij are non-negative mean that there is a solution with c i > 0 for all i, as required. This fixes the c i up to an overall multiplicative constant which can be found by insisting that P∞ is normalised in (171). It may be verified that P∞ is indeed the steady state of (27).
Physically, the statistical weight of ϕ i (t) (in the steady state) is the product of the rate of jumps into ϕ i and the probability to survive a time t before jumping again.
Following [33], we now derive a thermodynamic uncertainty relation (TUR) for quantum reset processes. This concerns full-counting statistics at level-1. (For other quantum TURs see also Refs. [77,78,65,79].) Combining (54,46) one sees that where the infimum is constrained such that the auxiliary process (with rates w A ) must have an average jump ratek, and µ is the steady state distribution of the auxiliary process. As explained in Sec. 3.4.3, the solution to this infimum is obtained when w A is given by (55), but we do not have explicit formulae for the quantities appearing in that equation. However, bounds on I 1 are available from (173), by choosing suitable auxiliary processes that solve the constraint.
The auxiliary process that we consider is also a quantum reset process. It has the same reset states and the same deterministic evolution, so the function ϕ i (t) remains the same. The analogue of the function p ij (t) isp ij (t). This fully determines the auxiliary quantum reset process and we define the correspondingŜ i (t),R ij (t) andĉ i by analogy with S i , R ij , c i from above.
From (168), one sees that the corresponding jump rate is (It is sufficient to specify this rate for states ψ within the support of P∞ so this fully determines the auxiliary jump rates.) The steady state distribution of the auxiliary process is µ, which is given by (171), with (c, S) replaced by (ĉ,Ŝ). Hence for this process one has from (46) that Using (167) one has j p ij = −(dS i /dt) and similarly forp,Ŝ. This allows the sum over j to be performed in some terms, yielding After some integrations by parts and using S i (0) = 1 and S i (∞) = 0, the second term on the right-hand side evaluates to zero. Moreover, this auxiliary model has average jump rates k i =ĉ i . Hence by (173), we find This is a generic bound on the rate function at level-1, as long as thep ij are chosen such that the resultingĉ i =k i . We now make a specific choice forp, aŝ where u ij , v ij are chosen to satisfyR ij = R ij , as we now discuss. (The idea to accelerate or reduce the jump rate, leaving the distribution of jump destinations invariant.) To find the relation between u and v, recall (169) and define This τ ij is the mean time between a jump of type i and one of type j, and σ 2 ij is the variance of this time. Then by (178) one haŝ We take v ij = 1 + u ij τ ij + u 2 ij (τ 2 ij − σ 2 ij )/2 which ensuresR ij = R ij (1 + O(u 3 )). By the eigenproblems (172) for c,ĉ, this means thatĉ i = c i (λ + O(u 3 )) for some constant λ (independent of i). This λ is the factor by which the auxiliary dynamics has been accelerated.
The conditionsR ij = R ij from above have been used to fix the v ij , but the u ij are still free. We choose these to achieve a uniform acceleration of all jumps, as usual in derivations of TURs. To achieve this defineτ ij analogous to τ ij and choose u ij such thatτ ij = τ ij /λ. This requires u ij = (λ−1)τ ij /σ 2 ij . Nowp ij is fully determined in terms of λ and the auxiliary process has average jump ratesk i = λc i . It follows that p ij log(p ij /p ij )dt = R ij τ 2 ij (λ − 1) 2 /(2σ 2 ij ), with a correction of order O(λ − 1) 3 . Using this in (177) yields where c is a vector whose elements are the c i , and Note that I 1 in (181) is the rate function from (173), whose argument is the vectork. The inequality (181) applies when the argumentk of the rate function is parallel to c.
Now consider an arbitrary linear combination of jump rates k b τ = b ik i τ and let b be the vector whose elements are the b i . Clearly E[k b τ ] = b · c. For large τ then k b τ obeys a central limit theorem whose variance can be obtained from the rate function I 1 , specifically, τ Var(k b τ ) ≈ b · H −1 b where H is the Hessian of I 1 (at its minimum). Since H is symmetric positive definite then the Cauchy-Schwartz inequality implies that (b · H −1 b)(c · Hc) ≥ (b · c) 2 . From (181) then (c · Hc) ≤ χ so one has a thermodynamic uncertainty relation For classical systems p ij is an exponential distribution so σ ij = τ ij and χ = ik i is the total jump rate, so (183) becomes a classical TUR [61]. For quantum systems one may have anti-bunching of jump events, leading to σ ij < τ ij . In such cases the variance of time-averaged currents can violate the classical thermodynamic uncertainty relation while still obeying (183). This theory was analysed for a simple model system in [33].
where Wa is the tilted generator (94). The argument that the largest eigenvalue of this operator coincides with Θ[a 1 , a 2 ] then follows exactly as in Appendix B.1.
The eigenvalue equation for Θ is and corresponding eigenfunction of the adjoint operator is f L and dψf L f R = 1. Hence (192) Differentiating with respect to a 1 , a 2 and using (190) yields (as before) that and also The continuity condition (92) Substituting for a m 2 using (194) and using the definition of D ij,hk one obtains the continuity condition (92).
It only remains to combine the ingredients, so as to compute the maximal value of F [a 1 , a 2 ]. Using (192,193) (197) Note from (194) that f L K m · ∇f R = j m − µa m 2 . Using this to substitute the term involving K m simplifies this equation: Integrating several times by parts and rearranging terms, we obtain (200) where we also used the definition of D in terms of K. The first term may be simplified using the continuity equation (92) and after one more integration by parts we obtain Finally using from (194) that a m 2 = (j m /µ) − (K m · ∇Λ) yields F = 1 2 dψ m (j m ) 2 /µ, as required for (93).

E.3 The empirical probability current as a function of empirical measure and empirical noise
We derive (87) which relates the empirical current to the empirical noise. It is convenient to introduce a function x : M → R and define a random (trajectory-dependent) variable Note Xτ is a matrix-valued quantity, as is J e τ . From (86) we have Xτ = 1