Stochastic Entropy Production: Fluctuation Relation and Irreversibility Mitigation in Non-unital Quantum Dynamics

In this work, we study the stochastic entropy production in open quantum systems whose time evolution is described by a class of non-unital quantum maps. In particular, as in Phys Rev E 92:032129 (2015), we consider Kraus operators that can be related to a nonequilibrium potential. This class accounts for both thermalization and equilibration to a non-thermal state. Unlike in the unital case, non-unitality is responsible for an imbalance between the forward and backward dynamics of the open quantum system under scrutiny. Here, concentrating on observables that commute with the invariant state of the evolution, we show how the nonequilibrium potential enters the statistics of the stochastic entropy production. In particular, we prove a fluctuation relation for the latter and we find a convenient way of expressing its average solely in terms of relative entropies. Then, the theoretical results are applied to the thermalization of a qubit with a non-Markovian transient, and the phenomenon of irreversibility mitigation, introduced in Phys Rev Res 2:033250 (2020), is analyzed in this context.


I. Introduction
The entropy production is a key quantity in the study of open quantum systems (quantum systems in interaction with the external environment). On the one hand, it identifies the amount of energy that is irreversibly lost in terms of heat during the interaction process, and, on the other hand, it quantifies the degree of irreversibility of the resulting dynamics [1-6]. In this paper, we adopt the point of view of stochastic quantum thermodynamics, so that the entropy production is defined as a stochastic variable, whose probability distribution and statistical moments can be computed by resorting to the Tasaki-Crooks formalism [7-10]. According to such a formalism, the stochastic entropy production quantifies how much the joint probability distribution of two measurement outcomes at different times along a quantum evolution differs from the corresponding distribution in the time-reversed dynamics.
Since we are dealing with open quantum systems, the dynamics is not unitary and the definition of a time-reversed path is not straightforward. If the system and its environment are uncorrelated at time t = 0, the evolution at later times is described in general by a one-parameter family of Completely Positive and Trace Preserving (CPTP) maps [11]. Many results in quantum thermodynamics deal with the so-called unital maps, namely with the special class of time evolutions that preserve the maximally-mixed state (multiple of the identity). In fact, for unital maps there is a natural notion of inverse dynamics given by the dual quantum map. Moreover, in this case the entropy production is known to depend only on the initial and final quantum states of the nonequilibrium process under scrutiny, not on the full history. As a result, on average the entropy production is provided by the difference of the von Neumann entropies in the final and initial states. In this regard, fluctuation relations and the statistics of thermodynamic quantities in unital quantum maps have been studied for instance in [12-14]. It is worth mentioning in particular the case of unitary dynamics interspersed by projective measurements, which also belongs to this class [15-20].
However, many physically interesting phenomena are described by non-unital maps, like thermalization, dissipation to a non-thermal state, and even feedback-controlled mechanisms such as those of Maxwell demons. In this setting, the definition of a time-reversed protocol is less straightforward; however, a few proposals have appeared in the literature [10,21-23], so that the Tasaki-Crooks formalism equally applies to the thermodynamics of non-unital quantum maps. It turns out that, in general, for non-unital dynamics the entropy production has a more complicated structure than for unital ones, both on average and at the stochastic level, where fluctuations are involved. Therefore, it is important to investigate whether simpler and physically informative results can still be proved for particular classes of non-unital maps (possibly with some restrictions on the initial state and/or on the measured observables). On top of this, since the formalism of quantum dynamical maps allows one to consider both Markovian and non-Markovian dynamics (in the sense of [24]), one could also ask whether the degree of non-Markovianity plays a relevant role. In fact, although the interest in non-Markovian effects related to quantum thermodynamics has increased in the last decade, and a few results have been established for average thermodynamic quantities, much less is known at the level of fluctuations. A non-exhaustive list of references about quantum thermodynamics in these contexts is [4,14,19,25-39].
Motivated by the previous discussion, in this paper we study the stochastic entropy production in a class of non-unital quantum dynamics that describes, for instance, thermalization with a non-Markovian transient. In Section II, we describe the Tasaki-Crooks formalism by giving the definition of stochastic entropy production. In Section III, we present two main results. In particular, we show that, for the considered class of dynamics and for the set of observables commuting with the invariant state of the map, i) the stochastic entropy production obeys a fluctuation relation and ii) the average entropy production can always be written solely in terms of quantum relative entropies. We then comment on the difference between our results and the related work done in Refs. [40,41]. In Section IV, we briefly introduce the concept of quantum non-Markovianity and then we discuss a thermalizing qubit dynamics with non-Markovian transient, computing explicitly the stochastic entropy production and its first two moments. We also compute the rate of variation of the average and of the variance of the stochastic entropy production. This allows us to find our third main result. Indeed, in Section V, as in the unital setting, iii) we find a parameter range such that, due to non-Markovianity, the average entropy production and its variance are simultaneously decreasing in the transient. This is the phenomenon we called irreversibility mitigation [37]. Finally, in Section VI, we present conclusions and perspectives for future investigations.

II. Stochastic entropy production
Let us consider a $d$-level open quantum system whose evolution up to time $t$ is described by a CPTP map $\Lambda_t$. We assume that the latter has a unique positive-definite invariant state $\pi = \sum_i \pi_i |\pi_i\rangle\langle\pi_i| > 0$, with probabilities $\pi_i \in [0,1]$ and $\sum_i \pi_i = 1$, such that the map is in general non-unital. We require the uniqueness of $\pi$ because, in the following, we use a notion of time-reversed dynamics that is based on the invariant state.
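In order to define the stochastic entropy production, which is the quantity of interest in this paper, we need to consider two distinct procedures, namely a forward and a backward protocol, in accordance with the Tasaki-Crooks formalism [7-10]. In particular, we resort to the two-point measurement (TPM) scheme [42-44]; this choice stems also from our intention to compare the results we are going to present with the ones determined in [37,40]. In the time interval $[0,\tau]$, the TPM scheme consists in applying two projective measurements, at the initial and at the final time instants, of the two observables $O_{\rm in}$ and $O_{\rm fin}$ (Hermitian operators) [41,45]. The spectral decomposition of the observables is
$$O_{\rm in} = \sum_m a^{\rm in}_m \Pi^{\rm in}_m, \qquad O_{\rm fin} = \sum_k a^{\rm fin}_k \Pi^{\rm fin}_k,$$
where each $\{\Pi\}$ identifies the set of projectors associated to each set of eigenvalues/measurement outcomes $\{a\}$. The forward process is defined by the following sequence of operations:
$$\rho_0 \;\xrightarrow{\{\Pi^{\rm in}_m\}}\; \rho_{\rm in} \;\xrightarrow{\Lambda_\tau}\; \rho_\tau \;\xrightarrow{\{\Pi^{\rm fin}_k\}}\; \rho_{\rm fin}, \tag{1}$$
where, after the first projective measurement, the state of the system is on average
$$\rho_{\rm in} = \sum_m p(a^{\rm in}_m)\,\Pi^{\rm in}_m, \qquad p(a^{\rm in}_m) = \mathrm{Tr}[\rho_0 \Pi^{\rm in}_m].$$
Applying the second projective measurement to the state $\rho_\tau \equiv \Lambda_\tau[\rho_{\rm in}]$ after the open quantum dynamics, one gets
$$\rho_{\rm fin} = \sum_k p(a^{\rm fin}_k)\,\Pi^{\rm fin}_k, \qquad p(a^{\rm fin}_k) = \mathrm{Tr}[\rho_\tau \Pi^{\rm fin}_k].$$
Hence, the joint probability that the measurement outcome $a^{\rm in}_m$ is followed by the measurement outcome $a^{\rm fin}_k$ in the forward process is
$$p_F(a^{\rm fin}_k, a^{\rm in}_m) = \mathrm{Tr}\big[\Pi^{\rm fin}_k \Lambda_\tau(\Pi^{\rm in}_m)\big]\, p(a^{\rm in}_m). \tag{2}$$
Conversely, the backward process consists in the following set of operations:
$$\tilde\rho_{\rm fin} \;\xrightarrow{\{\tilde\Pi^{\rm fin}_k\}}\; \tilde\rho_{\rm fin} \;\xrightarrow{\tilde\Lambda_\tau}\; \tilde\rho_\tau \;\xrightarrow{\{\tilde\Pi^{\rm in}_m\}}\; \tilde\rho_{\rm in}, \tag{3}$$
with $\tilde\rho_{\rm fin} \equiv \Theta\rho_{\rm fin}\Theta^\dagger$. More explicitly, following [40], the main ingredient of the backward process is the dual map $\tilde\Lambda_t$. Given the invariant state $\pi$ and the Kraus representation of $\Lambda_t$, i.e., $\Lambda_t(\rho) = \sum_{\ell=1}^{d^2} E_\ell(t)\,\rho\,E_\ell^\dagger(t)$, where $\{E_\ell(t)\}$ is the set of Kraus operators and $d$ is the dimension of the system's Hilbert space, the dual map is defined as
$$\tilde\Lambda_t(\rho) = \sum_\ell \tilde E_\ell(t)\,\rho\,\tilde E_\ell^\dagger(t), \qquad \tilde E_\ell(t) \equiv \Theta\,\pi^{1/2} E_\ell^\dagger(t)\,\pi^{-1/2}\,\Theta^\dagger.$$
Here $\Theta$ is the antiunitary time-reversal operator obeying the relations $\Theta^\dagger\Theta = \Theta\Theta^\dagger = I$. By construction, $\tilde\Lambda_t$ is CPTP and its invariant state is $\tilde\pi = \Theta\pi\Theta^\dagger$, namely the time-reversed version of $\pi$. Additionally, in order to fully specify the operations in Eq. (3), we also set the first [second] measurement of the backward process to be the time-reversal of the second [first] measurement of the forward process, i.e., $\tilde\Pi^{\rm fin}_k \equiv \Theta\Pi^{\rm fin}_k\Theta^\dagger$ and $\tilde\Pi^{\rm in}_m \equiv \Theta\Pi^{\rm in}_m\Theta^\dagger$. With this assumption, the joint probability that in the backward process the measurement outcome $a^{\rm fin}_k$ is followed by the measurement outcome $a^{\rm in}_m$ is
$$p_B(a^{\rm in}_m, a^{\rm fin}_k) = \mathrm{Tr}\big[\tilde\Pi^{\rm in}_m \tilde\Lambda_\tau(\tilde\Pi^{\rm fin}_k)\big]\, p(a^{\rm fin}_k). \tag{4}$$
The stochastic quantum entropy production is then defined as [37,41,45]
$$\Delta\sigma(a^{\rm fin}_k, a^{\rm in}_m) \equiv \ln\frac{p_F(a^{\rm fin}_k, a^{\rm in}_m)}{p_B(a^{\rm in}_m, a^{\rm fin}_k)} = \ln\frac{p(a^{\rm in}_m)}{p(a^{\rm fin}_k)} + \ln\frac{p_F(a^{\rm fin}_k|a^{\rm in}_m)}{p_B(a^{\rm in}_m|a^{\rm fin}_k)}, \tag{5}$$
where $p_F(a^{\rm fin}_k|a^{\rm in}_m)$ and $p_B(a^{\rm in}_m|a^{\rm fin}_k)$ are the forward and backward conditional probabilities. In the unital case these conditional probabilities coincide, so that only the first term survives. In the non-unital case, instead, the forward and backward conditional probabilities are not equal to each other, and in general their expression depends on the details of the quantum map $\Lambda_t$.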

III. Fluctuation relation
Even in the non-unital case, one would like to determine if under some assumptions the ratio of conditional probabilities $p_F(a^{\rm fin}_k|a^{\rm in}_m)/p_B(a^{\rm in}_m|a^{\rm fin}_k)$ can be expressed in terms of quantities that do not depend on time. As we will show below, this is indeed possible when the proposal of Ref. [10] is used to define the time-reversal protocol, and such a key aspect allows one to derive a fluctuation relation for the stochastic entropy production in non-unital dynamics, like the ones describing thermalization. It is worth noting that other choices of the backward dynamics, as in Refs. [22,23], do not necessarily guarantee that the ratio of conditional probabilities $p_F(a^{\rm fin}_k|a^{\rm in}_m)/p_B(a^{\rm in}_m|a^{\rm fin}_k)$, although defined starting from the quantum map $\Lambda_t$, does not depend on time.
To get the fluctuation relation, we make use of the concept of nonequilibrium potential (originally introduced in [40]), which is defined in terms of the invariant state $\pi$ of the quantum map $\Lambda_t$. Given the spectral decomposition $\pi = \sum_i \pi_i |\pi_i\rangle\langle\pi_i|$, we assign to each projector $|\pi_i\rangle\langle\pi_i|$ the nonequilibrium potential term
$$\Phi_\pi(i) \equiv -\ln \pi_i. \tag{7}$$
Then, we focus on a class of quantum maps in Kraus representation where each Kraus operator $E_\ell$ is a linear combination of "jump" operators $|\pi_j\rangle\langle\pi_i|$ associated with the same change $\Delta\Phi_\pi(\ell)$ of the nonequilibrium potential.
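As we are going to show below, this assumption is the first of a pair of conditions that are required to express $\ln[p_F(a^{\rm fin}_k|a^{\rm in}_m)/p_B(a^{\rm in}_m|a^{\rm fin}_k)]$ as a function of the $\Phi_\pi(i)$'s, provided the TPM scheme is applied. Following the previous discussion, the single Kraus operator $E_\ell(t)$ can be written as
$$E_\ell(t) = \sum_{i,j} m^{\ell}_{ji}(t)\,|\pi_j\rangle\langle\pi_i|, \tag{8}$$
with the constraint that the coefficients $m^{\ell}_{ji}(t)$ vanish whenever $\Phi_\pi(j) - \Phi_\pi(i) \neq \Delta\Phi_\pi(\ell)$. Hence, all the transitions $(i,j)$ that are contained in the same Kraus operator $E_\ell(t)$ are characterized by the same potential change $\Delta\Phi_\pi(\ell)$; of course, there may be different Kraus operators that are associated to the same value of $\Delta\Phi_\pi$. With these assumptions, the following relations hold true at any time $t$:
$$[E_\ell(t), \ln\pi] = \Delta\Phi_\pi(\ell)\,E_\ell(t), \tag{9}$$
$$[E_\ell^\dagger(t), \ln\pi] = -\Delta\Phi_\pi(\ell)\,E_\ell^\dagger(t), \tag{10}$$
since
$$[E_\ell(t), \ln\pi] = \sum_{i,j} m^{\ell}_{ji}(t)\,(\ln\pi_i - \ln\pi_j)\,|\pi_j\rangle\langle\pi_i| = \sum_{i,j} m^{\ell}_{ji}(t)\,[\Phi_\pi(j) - \Phi_\pi(i)]\,|\pi_j\rangle\langle\pi_i| = \Delta\Phi_\pi(\ell)\,E_\ell(t), \tag{11}$$
where the last equality in Eq. (11) comes from using Eq. (8). Equivalently, $\pi^{-1/2}E_\ell\,\pi^{1/2} = e^{\Delta\Phi_\pi(\ell)/2}E_\ell$ and $\pi^{1/2}E_\ell^\dagger\,\pi^{-1/2} = e^{\Delta\Phi_\pi(\ell)/2}E_\ell^\dagger$. One can use the relations (9) and (10) to compare the forward conditional probability
$$p_F(a^{\rm fin}_k|a^{\rm in}_m) = \sum_\ell \mathrm{Tr}\big[\Pi^{\rm fin}_k E_\ell\,\Pi^{\rm in}_m E_\ell^\dagger\big] \tag{12}$$
with the backward one
$$p_B(a^{\rm in}_m|a^{\rm fin}_k) = \sum_\ell \mathrm{Tr}\big[\tilde\Pi^{\rm in}_m \tilde E_\ell\,\tilde\Pi^{\rm fin}_k \tilde E_\ell^\dagger\big] = \sum_\ell e^{\Delta\Phi_\pi(\ell)}\,\mathrm{Tr}\big[\Theta\,\Pi^{\rm in}_m E_\ell^\dagger\,\Pi^{\rm fin}_k E_\ell\,\Theta^\dagger\big] = \sum_\ell e^{\Delta\Phi_\pi(\ell)}\,\mathrm{Tr}\big[\Pi^{\rm fin}_k E_\ell\,\Pi^{\rm in}_m E_\ell^\dagger\big], \tag{13}$$
where $\tilde E_\ell \equiv \Theta\,\pi^{1/2}E_\ell^\dagger\,\pi^{-1/2}\Theta^\dagger$ are the Kraus operators of the dual map and $\tilde\Pi \equiv \Theta\Pi\Theta^\dagger$. Note that in the last step of Eq. (13) we used the property $\mathrm{Tr}[\Theta A\Theta^\dagger] = \mathrm{Tr}[A^\dagger]$, valid for antiunitary operators (the usual cyclicity of the trace does not hold). This property can be proved from the characterization of antiunitary operators ($\Theta$ antilinear such that $\langle\Theta\psi, \Theta\phi\rangle = \overline{\langle\psi, \phi\rangle}$) and their adjoint ($\Theta^\dagger$ antilinear such that $\langle\Theta\psi, \phi\rangle = \overline{\langle\psi, \Theta^\dagger\phi\rangle}$). When taking the ratio $p_F(a^{\rm fin}_k|a^{\rm in}_m)/p_B(a^{\rm in}_m|a^{\rm fin}_k)$, one can see from Eqs. (12) and (13) that the summation over $\ell$ (different Kraus operators) does not allow simplifications in general. A special case is constituted by unital maps, where all the $\Delta\Phi$'s vanish, so that the ratio of conditional probabilities equals one. Dealing with non-unital maps, one needs a further assumption to express $\ln[p_F(a^{\rm fin}_k|a^{\rm in}_m)/p_B(a^{\rm in}_m|a^{\rm fin}_k)]$ only in terms of $\Delta\Phi_\pi$. In particular, the second assumption we make is that both observables $O_{\rm in}$ and $O_{\rm fin}$ commute with the invariant state $\pi$ of the forward process. In other terms, one needs that $[O_{\rm in}, \pi] = [O_{\rm fin}, \pi] = 0$, such that the measurements applied at the initial and final times of the quantum process are described by the following projectors:
$$\Pi^{\rm in}_m = |\pi_m\rangle\langle\pi_m|, \qquad \Pi^{\rm fin}_k = |\pi_k\rangle\langle\pi_k|. \tag{14}$$
In this way, a specific transition between $|\pi_m\rangle$ and $|\pi_k\rangle$ is selected by the measurement, and this transition is in turn associated to a single value of the nonequilibrium potential, which can then be extracted from the summation. More explicitly, one can write
$$\frac{p_F(a^{\rm fin}_k|a^{\rm in}_m)}{p_B(a^{\rm in}_m|a^{\rm fin}_k)} = \frac{\widetilde{\sum}_\ell\, |\langle\pi_k|E_\ell|\pi_m\rangle|^2}{\widetilde{\sum}_\ell\, e^{\Delta\Phi_\pi(\ell)}\,|\langle\pi_k|E_\ell|\pi_m\rangle|^2} = e^{-\Delta\Phi(k,m)}, \tag{15}$$
where one has defined $\Delta\Phi(k,m) \equiv \Phi_\pi(k) - \Phi_\pi(m)$ and the summation with the tilde runs only over the Kraus operators (labelled by $\ell$) that include this transition. Therefore, under the validity of the aforementioned assumptions (i.e., (i) Kraus operators of the form (8) and (ii) $[O_{\rm in}, \pi] = [O_{\rm fin}, \pi] = 0$), the stochastic quantum entropy production $\Delta\sigma(a^{\rm fin}_k, a^{\rm in}_m)$ does not depend on the details of the evolution, but it just depends on measurement probabilities evaluated at the initial and final times of the process and on the eigenvalues $\pi_i$, with $i \in \{1, \dots, d\}$, of the invariant state $\pi$. In particular, one has
$$\Delta\sigma(a^{\rm fin}_k, a^{\rm in}_m) = \Delta s(k,m) - \Delta\Phi(k,m), \tag{16}$$
where $\Delta s(k,m) \equiv s(a^{\rm fin}_k) - s(a^{\rm in}_m)$, with $s(a^{\rm in}_m) \equiv -\ln p(a^{\rm in}_m)$ and $s(a^{\rm fin}_k) \equiv -\ln p(a^{\rm fin}_k)$. Thus, $\Delta s(k,m)$ denotes the difference of self-information of measuring the initial and final quantum observables, $O_{\rm in}$ and $O_{\rm fin}$ respectively, by recording the corresponding outcomes $a^{\rm in}_m$ and $a^{\rm fin}_k$. Let us observe that $\Delta s$ and $\Delta\Phi_\pi$ enter with opposite signs; this indicates that, unlike the unital case, in non-unital quantum maps the total amount of entropy production in a single trajectory can be decreased during the process, e.g., by a mechanism with feedback, such as a Maxwell demon with efficacy greater than unity.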
It is worth stressing the role played by the assumptions (i) and (ii) in achieving Eq. (15). Assumption (i) is responsible for having a single value $\Delta\Phi_\pi(\ell)$ for the change of nonequilibrium potential in correspondence of the $\ell$-th Kraus operator $E_\ell$; it leads to the validity of the commutation relations (9)-(11). Instead, assumption (ii) allows one to express the ratio $\ln[p_F(a^{\rm fin}_k|a^{\rm in}_m)/p_B(a^{\rm in}_m|a^{\rm fin}_k)]$ in terms of the single value $\Delta\Phi(k,m)$. Eq. (16) straightforwardly implies the first main result of this paper, which is the fluctuation relation describing a symmetry for the ratio of the forward and backward probability distributions of measuring the $(k,m)$-th pair of outcomes from $O_{\rm in}$ and $O_{\rm fin}$:
$$\frac{p_F(a^{\rm fin}_k, a^{\rm in}_m)}{p_B(a^{\rm in}_m, a^{\rm fin}_k)} = e^{\Delta s(k,m) - \Delta\Phi(k,m)}. \tag{17}$$
Specifically, the right-hand side of Eq. (17) depends on the measured outcomes at the initial and final times of the process, and on the invariant state $\pi$ of the non-unital quantum map. Other details of the map are completely absent in Eqs. (16) and (17), although we have previously determined that assumptions (i)-(ii) have to be satisfied in order to get such equations. Moreover, another consequence of Eq. (17) is that the identity $p_F(a^{\rm fin}_k, a^{\rm in}_m) = p_B(a^{\rm in}_m, a^{\rm fin}_k)$, which entails vanishing entropy production at the single-trajectory level, meaning (strict) reversibility, is valid if and only if
$$\Delta s(k,m) = \Delta\Phi(k,m), \quad \text{i.e.,} \quad \frac{p(a^{\rm in}_m)}{p(a^{\rm fin}_k)} = \frac{\pi_m}{\pi_k}. \tag{18}$$
This is a very strong constraint, which holds true when the initial state is the invariant state of the dynamics, so that there is no real evolution (the two measurements have no effect either, because we are assuming that the measured observables commute with the invariant state). Before concluding this discussion, we comment on a result in Ref. [41] that looks analogous to Eq. (16), and we explain why it is in fact a different result. In Ref. [41], the authors assume that they can access and measure both the system (with two observables identified by the projectors $\{P_n\}$, $\{P^*_m\}$, in the notation of the cited reference) and the reservoir (with two observables having projectors $\{Q_\nu\}$, $\{Q^*_\mu\}$). Each trajectory in their framework is thus specified by both the initial and final outcomes for the system and the initial and final outcomes for the reservoir. There, through the measurements performed on the reservoir, a unique Kraus operator, say $M_{\mu\nu}$, is selected for each trajectory of the system's conditioned evolution. The latter is indeed described by a quantum operation reading
$$\mathcal{E}_{\mu\nu}(\rho) = M_{\mu\nu}\,\rho\,M_{\mu\nu}^\dagger.$$
As a consequence, when computing the forward and backward joint probabilities, a single Kraus operator is involved, and thus a single value of the nonequilibrium potential is selected. Hence, without considering any further assumption, one can derive an expression for the entropy production that formally reads as our Eq. (16), although each single trajectory also carries the information on the reservoir outcomes. At variance with the setting of Ref. [41], we consider the case where only the system can be accessed, and, consistently with such a description, each trajectory is determined by the initial and final outcomes for the system only. We are thus grouping together different reservoir trajectories of Ref. [41], so that many Kraus operators are involved in the computation of the joint probabilities, Eqs. (2) and (4). Therefore, in general it is not possible to extract a single value of the nonequilibrium potential for each trajectory, which prevents one from deriving Eq. (16). The novelty of our result is that, even in the case where one cannot associate a single Kraus operator to a trajectory (e.g., by only knowing the reduced dynamics instead of the conditioned evolution), the fluctuation relation can be restored by choosing a specific set of observables. It is indeed the measurement that in our case selects a single transition for any trajectory and allows the derivation of our result.
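The role of the two assumptions can be checked numerically on a toy example. The following is only a minimal sketch, not the map studied later in the paper: it assumes a hypothetical thermalizing qubit map with three real Kraus operators (each carrying a single potential change), takes $\Theta$ to be complex conjugation in the eigenbasis of $\pi$ (so that it acts trivially on real matrices), and uses illustrative values for `beta_omega`, `q` and `p_in`. It verifies that the ratio of conditional probabilities equals $e^{-\Delta\Phi(k,m)}$ and that the resulting $\Delta\sigma$ satisfies $\langle e^{-\Delta\sigma}\rangle = 1$.

```python
import math

beta_omega = 1.0  # assumed value of beta * omega (illustrative)

# Gibbs populations of |0> (energy +omega/2) and |1> (energy -omega/2)
w = [math.exp(-beta_omega / 2), math.exp(beta_omega / 2)]
pi = [x / sum(w) for x in w]
Phi = [-math.log(p) for p in pi]  # nonequilibrium potential Phi_pi(i) = -ln pi_i

# Hypothetical thermalizing map; E[row][col] = <pi_row| E |pi_col>, all entries real.
q = 0.7                      # assumed damping strength in [0, 1]
a, b = q * pi[1], q * pi[0]  # decay (0 -> 1) and excitation (1 -> 0) probabilities
kraus = [
    [[0.0, 0.0], [math.sqrt(a), 0.0]],                   # jump |1><0|, change Phi[1] - Phi[0]
    [[0.0, math.sqrt(b)], [0.0, 0.0]],                   # jump |0><1|, change Phi[0] - Phi[1]
    [[math.sqrt(1 - a), 0.0], [0.0, math.sqrt(1 - b)]],  # diagonal, change 0
]

def dual(E):
    """Dual Kraus operator pi^{1/2} E^T pi^{-1/2} (Theta trivial on real matrices)."""
    return [[math.sqrt(pi[i]) * E[j][i] / math.sqrt(pi[j]) for j in range(2)]
            for i in range(2)]

dual_kraus = [dual(E) for E in kraus]

def p_forward(k, m):
    """p_F(k|m) = sum_l |<pi_k| E_l |pi_m>|^2."""
    return sum(E[k][m] ** 2 for E in kraus)

def p_backward(m, k):
    """p_B(m|k) = sum_l |<pi_m| dual(E_l) |pi_k>|^2."""
    return sum(E[m][k] ** 2 for E in dual_kraus)

pairs = [(k, m) for k in range(2) for m in range(2)]
ratios = {(k, m): p_forward(k, m) / p_backward(m, k) for k, m in pairs}
expected = {(k, m): math.exp(-(Phi[k] - Phi[m])) for k, m in pairs}

# Integral fluctuation relation for an assumed initial distribution p_in
p_in = [0.9, 0.1]
p_fin = [sum(p_forward(k, m) * p_in[m] for m in range(2)) for k in range(2)]

def delta_sigma(k, m):
    """Delta sigma = Delta s(k, m) - Delta Phi(k, m)."""
    return math.log(p_in[m] / p_fin[k]) - (Phi[k] - Phi[m])

ifr = sum(p_forward(k, m) * p_in[m] * math.exp(-delta_sigma(k, m)) for k, m in pairs)
```

Here `ratios` and `expected` can be compared entry by entry, and `ifr` should be equal to one up to rounding. The diagonal Kraus operator and both jump operators are kept separate on purpose, so that the sum over $\ell$ is non-trivial while each measured transition still picks a single value of the potential change.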
A. Average entropy production in terms of quantum relative entropies
Now, we are going to provide the expression of the average entropy production solely in terms of quantum relative entropies. This is a consequence of the simple expression we found for the stochastic quantum entropy production $\Delta\sigma$ (Eq. (16)).
To this end, let us consider the probability distribution of $\Delta\sigma$,
$$\mathrm{Prob}(\Delta\sigma) = \sum_{k,m} \delta\big[\Delta\sigma - \Delta\sigma(a^{\rm fin}_k, a^{\rm in}_m)\big]\, p_F(a^{\rm fin}_k, a^{\rm in}_m), \tag{19}$$
where $\delta[\cdot]$ denotes the Dirac delta. Hence, the average of $\Delta\sigma$ reads as $\langle\Delta\sigma\rangle = \sum_{m,k} \Delta\sigma(a^{\rm fin}_k, a^{\rm in}_m)\, p_F(a^{\rm fin}_k, a^{\rm in}_m)$, whereby, after some calculations, it can also be written as
$$\langle\Delta\sigma\rangle = S(\rho_\tau||\rho_{\rm fin}) + S(\rho_{\rm in}||\pi) - S(\rho_\tau||\pi), \tag{20}$$
where $S(\rho||\rho') \equiv \mathrm{Tr}[\rho\ln(\rho) - \rho\ln(\rho')]$ denotes the quantum relative entropy, and the density operators $\rho_{\rm in}$, $\rho_{\rm fin}$ and $\rho_\tau$ are defined above in Eq. (1). This is the second main result of the present paper. A similar expression, i.e., $\langle\Delta\sigma\rangle = S(\rho_{\rm in}||\pi) - S(\rho_\tau||\pi)$, was found in Ref. [41], thus without the explicit contribution of the second measurement, which is the first term in Eq. (20). The full derivation of Eq. (20) is in Appendix A. Note that, as stated in Eq. (20), $\langle\Delta\sigma\rangle$ is always non-negative, because the relative entropy is always non-negative and it is non-increasing under the action of a CPTP map on both arguments (therefore the difference of the second and third term is non-negative, since $\rho_\tau = \Lambda_\tau[\rho_{\rm in}]$ and $\pi = \Lambda_\tau[\pi]$).
It is worth observing that, if the invariant state of the non-unital quantum map $\Lambda_t$ is the thermal state, i.e.,
$$\pi = \rho_\beta \equiv \frac{e^{-\beta H}}{\mathrm{Tr}[e^{-\beta H}]}, \tag{21}$$
with $H$ the system Hamiltonian, then the difference of the nonequilibrium potential is $\Delta\Phi(k,m) = \beta(E_k - E_m)$, where the $E_i$'s ($i \in \{1, \dots, d\}$) denote the possible energy values of the quantum system.
Also, for a thermal invariant state (this is the case we are going to study in Sec. IV), Eq. (20) becomes
$$\langle\Delta\sigma\rangle = S(\rho_{\rm fin}) - S(\rho_{\rm in}) - \beta\,\mathrm{Tr}[H(\rho_\tau - \rho_{\rm in})], \tag{22}$$
where $S(\rho) \equiv -\mathrm{Tr}[\rho\ln(\rho)]$ is the von Neumann entropy. Therefore, one recovers a clear thermodynamic interpretation, with the (average) entropy production corresponding to the (average) entropy variation in the system plus the (average) entropy flux in the thermal bath (that is, minus the inverse temperature times the (average) heat exchanged, i.e., $-\beta\Delta Q$ with $\Delta Q \equiv \mathrm{Tr}[H(\rho_\tau - \rho_{\rm in})]$). In this setting, all the variation of energy is usually attributed to a heat exchange because the Hamiltonian of the system is taken to be time-independent. It would be interesting to generalize the theory so as to include the case of a non-unital evolution with time-dependent Hamiltonian. In order to do this, one would need to modify the definition of the backward protocol by including a dual map that depends on an instantaneous invariant state, i.e., $\pi_t$ such that $\Lambda_t(\pi_t) = \pi_t$. This is a matter for future investigation.
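As a consistency check of the average entropy production for a thermal invariant state, the sketch below compares three ways of computing $\langle\Delta\sigma\rangle$ for a qubit whose states stay diagonal in the energy basis: the trajectory average of $\Delta s - \Delta\Phi$, the difference of relative entropies $S(\rho_{\rm in}||\pi) - S(\rho_\tau||\pi)$, and the thermodynamic form "entropy change minus $\beta$ times heat". It is an illustration under stated assumptions: the transition matrix `T`, the strength `q` and the initial populations `p_in` are hypothetical, and for diagonal states $\rho_{\rm fin} = \rho_\tau$, so the measurement term $S(\rho_\tau||\rho_{\rm fin})$ vanishes and is not computed.

```python
import math

def shannon(p):
    """Entropy -sum p ln p of a diagonal state with populations p."""
    return -sum(x * math.log(x) for x in p if x > 0)

def rel_entropy(p, q):
    """Relative entropy sum p ln(p/q) between two diagonal states."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

beta, omega = 1.0, 2.0                      # assumed inverse temperature and gap
energies = [omega / 2, -omega / 2]          # H = (omega/2) sigma_z
weights = [math.exp(-beta * e) for e in energies]
pi = [x / sum(weights) for x in weights]    # Gibbs populations

q = 0.6                                     # hypothetical thermalization strength
T = [[1 - q * pi[1], q * pi[0]],            # T[k][m] = p_F(k|m), leaves pi invariant
     [q * pi[1], 1 - q * pi[0]]]

p_in = [0.8, 0.2]                           # assumed populations after the first measurement
p_tau = [sum(T[k][m] * p_in[m] for m in range(2)) for k in range(2)]

def delta_sigma(k, m):
    """Delta sigma = ln[p_in(m)/p_tau(k)] - [Phi(k) - Phi(m)], with Phi = -ln pi."""
    return math.log(p_in[m] / p_tau[k]) + math.log(pi[k] / pi[m])

avg_traj = sum(T[k][m] * p_in[m] * delta_sigma(k, m)
               for k in range(2) for m in range(2))
avg_rel = rel_entropy(p_in, pi) - rel_entropy(p_tau, pi)
heat = sum(e * (pt - pm) for e, pt, pm in zip(energies, p_tau, p_in))
avg_thermo = shannon(p_tau) - shannon(p_in) - beta * heat
```

The three numbers `avg_traj`, `avg_rel` and `avg_thermo` should agree up to rounding and be non-negative, which mirrors the contraction of the relative entropy towards the fixed point of the map.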
IV. Case-study: Qubit thermalization with non-Markovian transient
In this Section we consider an example to illustrate our previous findings, that is, a non-Markovian thermalizing dynamics for a two-level system (qubit).
The study of non-Markovian quantum dynamics became an active area of research in the last decade [46-49]. In the following, we adopt the nomenclature and definitions proposed in [24]. In particular, we consider the intermediate propagators $V_{t,s}$, implicitly defined by $\Lambda_t = V_{t,s}\Lambda_s$, and distinguish different degrees of non-Markovianity based on their properties. If $V_{t,s}$ is completely positive (CP) for any pair $t \geq s \geq 0$, the dynamics is called CP-divisible or Markovian. If instead the propagators are just positive (P), the dynamics is called P-divisible, and a dynamics which is P-divisible but not CP-divisible is classified as weakly non-Markovian. We will be more interested in dynamics such that the propagator $V_{t,s}$ is not even positive for some pair $t, s$. In this case the dynamics is not P-divisible and it is called essentially non-Markovian.
The divisibility properties of the dynamics (and therefore its degree of non-Markovianity) are better studied by looking at the time-dependent generator $\mathcal{L}_t = (\partial_t\Lambda_t)\Lambda_t^{-1}$, which exists provided the dynamical map $\Lambda_t$ is invertible and its time-derivative is well-defined.
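In the following, we consider the prototypical example of a thermalizing dynamics for a qubit, which is described by the time-dependent generator $\mathcal{L}_t$ explicitly written as
$$\mathcal{L}_t[\rho] = -i\frac{\omega}{2}[\sigma_z, \rho] + \gamma_\beta(t)\Big(\sigma_+\rho\,\sigma_- - \tfrac{1}{2}\{\sigma_-\sigma_+, \rho\}\Big) + \gamma_\beta(t)\,e^{\beta\omega}\Big(\sigma_-\rho\,\sigma_+ - \tfrac{1}{2}\{\sigma_+\sigma_-, \rho\}\Big). \tag{23}$$
Here $\sigma_h$ with $h \in \{x, y, z\}$ are the usual Pauli matrices, $\sigma_\pm = (\sigma_x \pm i\sigma_y)/2$, $\beta$ is the inverse temperature, $\omega$ is the qubit energy gap and $\gamma_\beta(t)$ is a time-dependent decay rate. Using the Bloch vector representation of the qubit density matrix, i.e., $\rho(t) = \frac{1}{2}(I + \mathbf{r}(t)\cdot\boldsymbol{\sigma})$, with $\boldsymbol{\sigma} \equiv \{\sigma_x, \sigma_y, \sigma_z\}$ and $\mathbf{r}(t) \equiv (x(t), y(t), z(t))$, one can rewrite the differential equation $\partial_t\rho(t) = \mathcal{L}_t\rho(t)$ as a system of differential equations for the Bloch vector components:
$$\partial_t x(t) = -\omega\,y(t) - \tfrac{1}{2}(1 + e^{\beta\omega})\gamma_\beta(t)\,x(t), \tag{24}$$
$$\partial_t y(t) = \omega\,x(t) - \tfrac{1}{2}(1 + e^{\beta\omega})\gamma_\beta(t)\,y(t), \tag{25}$$
$$\partial_t z(t) = -(1 + e^{\beta\omega})\gamma_\beta(t)\,z(t) + (1 - e^{\beta\omega})\gamma_\beta(t). \tag{26}$$
The equations above, complemented with the initial conditions $\mathbf{r}(0) \equiv (x_0, y_0, z_0)$, can be readily solved, so that one can write the Bloch vector components at arbitrary time $t \geq 0$ as follows,
$$x(t) = e^{-\Gamma_\beta(t)}\,[x_0\cos(\omega t) - y_0\sin(\omega t)], \tag{27}$$
$$y(t) = e^{-\Gamma_\beta(t)}\,[x_0\sin(\omega t) + y_0\cos(\omega t)], \tag{28}$$
$$z(t) = z_\infty + (z_0 - z_\infty)\,e^{-2\Gamma_\beta(t)}, \tag{29}$$
where we conveniently defined the integrated decay rate $\Gamma_\beta(t) \equiv \frac{1}{2}(1 + e^{\beta\omega})\int_0^t \gamma_\beta(\tau)\,d\tau$ and the parameter $z_\infty \equiv -\tanh(\frac{\beta\omega}{2})$. From Eqs. (27)-(29) one can verify that the unique invariant state for this dynamics is given by the Bloch vector $\mathbf{r}_\beta \equiv (0, 0, z_\infty)$, representing the Gibbs state $\rho_\beta$ (see Eq. (21)) with Hamiltonian $H = (\omega/2)\,\sigma_z$. We assume in the following that the time-dependent rate converges to a positive constant for long times, $\gamma_\beta(t) \to \gamma_\beta > 0$ as $t \to \infty$, such that the Gibbs state $\rho_\beta$ is also the unique asymptotic state for any initial density matrix. In this sense, the dynamics really describes the thermalization of a qubit to a thermal state, and the parameter $z_\infty$ represents the asymptotic value of $z(t)$, as the notation suggests. Note that with the special choice $\gamma_\beta(t) = \lambda(t)\,n_\beta = \lambda(t)\,\frac{1}{e^{\beta\omega} - 1}$ one recovers a time-dependent generalization of the usual quantum optical master equation. However, we prefer to leave $\gamma_\beta(t)$ unspecified, since the following discussion does not depend qualitatively on this choice. In particular, at the end we may want to compare our findings with the results obtained in Ref. [37] for the unital case (infinite temperature, $\beta = 0$), but the quantum optical master equation is not well-defined in this limit, so that another choice of $\gamma_\beta(t)$ will be used.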
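To make the role of the integrated decay rate concrete, the following sketch integrates the population equation numerically and compares it with a closed form. It is a sketch under explicit assumptions: the equation of motion is taken to be $\partial_t z = -(1 + e^{\beta\omega})\gamma_\beta(t)\,z + (1 - e^{\beta\omega})\gamma_\beta(t)$, with solution $z(t) = z_\infty + (z_0 - z_\infty)e^{-2\Gamma_\beta(t)}$, which is consistent with the definitions of $\Gamma_\beta(t)$ and $z_\infty$ given above; the rate `gamma(t)` with parameters `g0` and `a` is a hypothetical choice that is temporarily negative and tends to a positive constant at long times.

```python
import math

beta_omega = 1.0     # assumed value of beta * omega
g0, a = 0.2, 3.0     # hypothetical parameters of the decay rate

def gamma(t):
    """Hypothetical rate: positive at short and long times, negative around t = 1."""
    return g0 * (1.0 - a * t * math.exp(-t))

def Gamma(t):
    """Integrated rate (1/2)(1 + e^{beta omega}) * int_0^t gamma(s) ds, in closed form."""
    integral = g0 * (t - a * (1.0 - (1.0 + t) * math.exp(-t)))
    return 0.5 * (1.0 + math.exp(beta_omega)) * integral

z_inf = -math.tanh(beta_omega / 2.0)

def z_closed(t, z0=1.0):
    """Assumed closed form z(t) = z_inf + (z0 - z_inf) exp(-2 Gamma(t))."""
    return z_inf + (z0 - z_inf) * math.exp(-2.0 * Gamma(t))

def z_dot(t, z):
    """Right-hand side of the assumed population equation for z(t)."""
    e = math.exp(beta_omega)
    return gamma(t) * ((1.0 - e) - (1.0 + e) * z)

def z_rk4(t_end, z0=1.0, steps=3000):
    """Fourth-order Runge-Kutta integration of dz/dt = z_dot(t, z) from 0 to t_end."""
    h = t_end / steps
    t, z = 0.0, z0
    for _ in range(steps):
        k1 = z_dot(t, z)
        k2 = z_dot(t + h / 2, z + h * k1 / 2)
        k3 = z_dot(t + h / 2, z + h * k2 / 2)
        k4 = z_dot(t + h, z + h * k3)
        z += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return z
```

With this choice, `Gamma(t)` stays non-negative, so `z_closed(t)` remains a valid Bloch component, while the interval with `gamma(t) < 0` produces a temporary revival of `z(t)` towards its initial value, which is the signature of the non-Markovian transient.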

A. Kraus representation of the quantum map
In order to show that the dynamics described by Eq. ( 23) does satisfy assumption (i), one has to find a Kraus representation corresponding to that evolution.This can be done considering the general expression ρ(t) ≡ Λ t [ρ(0)] = 3 j,k=0 λ jk (t)σ j ρ(0)σ k , where σℓ ∈ σ for ℓ ∈ {0, . . ., 3}, and λ ij (t) are 16 complex parameters, substituting the Bloch vector decomposition of ρ(0) and ρ(t), and using the algebraic properties of Pauli matrices to rewrite triples of sigmas into a single sigma.This allows to find a relation between the λ ij (t) and the Bloch vector components x(t), y(t), z(t).The functions of time x(t), y(t) and z(t) are closely related to the parameters λ ij (t) and they can be written as a function of them, as shown in Appendix B. As provided in Appendix C, the Kraus representation of the quantum map Λ t governing the thermalization of the qubit (with non-Markovian transient) can be achieved by expressing coefficients λ ij (t) as a function of the model parameters, i.e., Γ β (t), ω and β.In this way, one can get the Kraus operators E ℓ describing the quantum map in diagonal form Λ Specifically, we obtain (see Appendix C for the complete derivation) where σ+ ≡ |0 1|, σ− ≡ (σ + ) † = |1 0| and Moreover, u (j) i denotes the j th element of the vector u i , with i, j = 1, 2, where

B. Stochastic entropy production
In this subsection we evaluate the expressions of the stochastic entropy production and the corresponding statistics for a qubit interacting with a thermal bath in accordance with Eqs. (27)-(29), entering in $\rho(t) = (I + \mathbf{r}(t)\cdot\boldsymbol{\sigma})/2$. The invariant state of the corresponding map is $\pi = e^{-\beta H}/\mathrm{Tr}[e^{-\beta H}]$ with $H = \omega\sigma_z/2$.
Since for the chosen CPTP maps the equations of motion of x(t) ± iy(t) decouple from the one of z(t), when fixing x 0 = y 0 = 0 the dynamical evolution (after the 1 st measurement of the TPM scheme) remains diagonal with respect to the σz basis.Hence, the state ρ in is evolved at time τ as where z m (0) = ±1 for m = 0, 1 (we recall that m is the label of the initial measurement outcomes).Let us note that the outcome of the second projective measurement is a fin 0 , with probability p(a fin 0 ) = [1 + z m (τ )]/2, and a fin 1 , with p(a fin 1 ) = [1 − z m (τ )]/2.As a result, the values of the stochastic entropy production ∆σ(k, m), labelled by the index over the measurement outcomes, are provided by the following quantities: In order to derive the probability distribution of the stochastic entropy production, as well as the corresponding statistical moments, we need to evaluate for the forward process the conditional probability that the outcome a fin k occurs after the outcome a in m from the first measurement, i.e., p Considering the four possible combinations of outcomes, the conditional probabilities are Now we can provide the expression of the average entropy production.In doing this, for the sake of an easier presentation, we assume that the initial state of the forward process at time t = 0 is one of the eigenstates of σz , i.e., ρ(0) = 1 2 [I + z m (0)σ z ].Hence, the state after the first measurement prescribed by the TPM scheme remains unchanged: As a result, one gets: which allows us to derive a relatively simple expression for its time-derivative: The full derivation of ∆σ by initializing the system in a generic density operator ρ 0 is in Appendix D. 
Always under the assumption of p(a in 0 ) = 1, we can also derive the second statistical moment of the stochastic entropy production, i.e., and the corresponding variance: In Appendix D the reader can also find the analytical expression of the time-derivative of Var(∆σ) that will be employed in the next Section about irreversibility mitigation.

V. Irreversibility mitigation
As already discussed in the unital case, we want to find a connection between the non-Markovianity of the dynamics and the phenomenon we called irreversibility mitigation [37]. More explicitly, we look for time intervals in which both the average entropy production and the variance are decreasing. In the perfectly reversible case, that is, zero entropy production in every single trajectory, the distribution Prob(∆σ) is a Dirac delta centered at zero. Since the dynamics we consider are irreversible, the average of the distribution typically shifts towards positive values as time passes and the distribution broadens, so that its variance grows (one no longer has a delta distribution). Therefore, irreversibility mitigation stems from the fact that, due to non-Markovianity, in a certain time interval the distribution Prob(∆σ) tends to get closer to a delta centered at zero.
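The two quantities monitored in this Section can be evaluated directly from the trajectory values of the entropy production. The sketch below is an illustration under stated assumptions: the qubit is prepared in $|0\rangle\langle 0|$ (so the first measurement has a single outcome), the stochastic entropy production is taken in the form $\Delta\sigma = \Delta s - \Delta\Phi$ with $\Delta\Phi = \beta(E_k - E_0)$, $E_0 = \omega/2$ and $E_1 = -\omega/2$, and the function name `entropy_production_stats` is hypothetical. The statistics are expressed as functions of the Bloch component $z$ at the time of the second measurement.

```python
import math

def entropy_production_stats(z, beta_omega):
    """Mean and variance of Delta sigma for a qubit prepared in |0><0|.

    z is the Bloch z-component at the final time; the two final outcomes
    occur with probabilities (1 + z)/2 and (1 - z)/2.
    """
    probs = [(1.0 + z) / 2.0, (1.0 - z) / 2.0]
    d_phi = [0.0, -beta_omega]          # beta (E_k - E_0) for k = 0, 1
    mean = 0.0
    second = 0.0
    for p, dphi in zip(probs, d_phi):
        if p > 0.0:                     # outcomes with zero probability do not contribute
            sigma = -math.log(p) - dphi # Delta s - Delta Phi, with p(a_in) = 1
            mean += p * sigma
            second += p * sigma ** 2
    return mean, second - mean ** 2
```

Feeding this function with a trajectory $z(t)$ and taking finite differences in time gives a simple way to scan for intervals in which the mean and the variance decrease together. Since the mean decreases as $z$ moves away from $z_\infty$ and back towards $1$, a revival of $z(t)$ lowers the average entropy production; whether the variance decreases in the same interval depends on the parameters.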
We consider the dynamics studied in the previous Section and we choose the initial state to be |0 0| as before, so that the dynamics is completely described by the z component of the Bloch vector.From Eqs. ( 27)-( 29), using z 0 = 1 one can compute so that the sign of ∂ t z(t) is opposite to the sign of ∂ t Γ β (t) = 1 2 (1 + e βω )γ β (t).It is known that the sign of γ β (t) is related to the divisibility property of the dynamics and therefore to its degree of non-Markovianity (see the discussion at the beginning of the previous Section).In particular, the dynamics is CP-divisible and P-divisible (in this particular example the two notions coincide) if and only if γ β (t) ≥ 0 for any time t.This can be seen for instance from Example 3 in [24].
In order to determine the sign of ∂ t ∆σ , one has to study the sign of the quantity in brackets in (39), that we call I(t) in the following.It is possible to show that it is always non-negative in a few steps Therefore, for β positive, the sign of ∂ t ∆σ corresponds to the sign of γ β (t) and, as a consequence, the average entropy production is decreasing whenever the dynamics is not P-divisible (essentially non-Markovian).This is consistent with the general result obtained in [4], valid for arbitrary initial state.
Let us now focus on the derivative of the variance.We rewrite the last line of equation ( 67) for convenience The interest is in time intervals such that both the variance and the average are decreasing.Therefore, in particular, their derivatives have to share the same sign.This means the term in parenthesis has to be positive.A necessary but not sufficient condition is z(t) ≥ 0. Indeed, if this is not true the term is evidently negative.More explicitly, the condition reads By means of a somewhat lengthy calculation that we present in Appendix E, one can also find a sufficient condition to produce irreversibility mitigation: In order to make the comparison with the unital case we can consider a γ β (t) that admits a finite limit for β = 0.In particular, one can take γ β (t) ≡ γ(t), independent from β.In this case one has for infinite temperature In Ref. [37] we found Γ 0 (t) ≤ 0.091 for unital maps (note the different notation: Γ 0 (t) in this paper corresponds to Φ(t) in Ref. [37]).One could still refine this bound using the theory of Padé approximants [50] (see also Appendix E), however, our aim was to show that a parameter range in which irreversibility mitigation happens does exist.

VI. Conclusions
In this paper we studied the stochastic quantum entropy production in non-unital quantum dynamics (e.g., thermalization). We started by recalling the Tasaki-Crooks formalism to introduce the stochastic entropy production, and then, as a first result, we studied the conditions whereby the latter obeys a fluctuation relation based on the so-called nonequilibrium potential, as in Refs. [40,41]. The nonequilibrium potential brings information on the invariant state of the open dynamics under scrutiny, and, as we have shown, it is responsible for the imbalance of the corresponding forward and backward processes. In the case of a thermal invariant state, it has a clear interpretation in terms of energy fluxes. Then, the conditions such that the stochastic entropy production obeys a fluctuation theorem can be summarised as follows: i) the Kraus operators induce jumps with determined energy gaps, and ii) the measured observables commute with the invariant state of the map. In fact, with respect to Refs. [40,41] and in line with Refs. [37,45], we also clarified the role of the chosen measurement scheme (TPM) both in the definition of the stochastic entropy production and in the derivation of its statistics. This led us to write the average entropy production in terms of quantum relative entropies, with a contribution that explicitly depends on the second measurement. This is the second main result.
Subsequently, we recalled the concept of quantum non-Markovianity, using a thermalizing qubit dynamics with non-Markovian transient as a case study. Then, as a third novel result, we determined a parameter range (depending on the temperature of the invariant state of the map) in which the average entropy production and its variance decrease at the same time during a transient, a phenomenon that we called irreversibility mitigation in Ref. [37].
Let us now discuss some possible extensions of our findings. First of all, our results are based on a construction of the backward protocol starting from the invariant state of the forward dynamics, which is assumed to be unique and strictly positive. It would be interesting to extend the framework in order to include cases with multiple and/or non-invertible invariant states. For instance, in the case of (infinitely) many invariant states, one could construct a backward protocol for each of them and then ask whether any of those protocols satisfies the condition on the Kraus operators that we used to prove the fluctuation theorem. It would also be worth understanding the relation between different choices of the reference invariant state. Even more strikingly, when the invariant state is not invertible one has to resort to a different method for building the reversed protocol, such as the one proposed in Ref. [21]. The existence of fluctuation relations in this framework needs to be investigated further. Moreover, even in the setting of this paper, as already argued above, our derivations hold true for non-unital quantum dynamics satisfying certain constraints on the Kraus operators. So far, they have been tested on a single case study, provided by qubit thermalization maps with non-Markovian transient. This allowed us to observe how irreversibility mitigation changes as a function of the temperature of an external bath. However, it would be interesting to study different examples in detail. In particular, it might be worth investigating whether our results apply to quantum maps (with a microscopic derivation) that are customarily considered in the scenario of open quantum systems, i.e., whether they admit the specific Kraus form we considered. Among them, we mention quantum collision models [51] and quantum Maxwell's demons [31,39,52-54], as both deal with repeated interactions with external agents (or just laser pulses) over time, which make the dynamics of the quantum system non-unital. Furthermore, one might consider how the nonequilibrium potential changes when employing measurement schemes beyond TPM [55-59], so as to understand the role played by quantum coherence/correlation terms in the initial state. Finally, as a long-term project, one might study whether irreversibility mitigation is favoured or hindered by increasing the size of the open quantum system under scrutiny. In this respect, spin-boson models like the multimode Dicke model could be a viable platform of investigation.

APPENDICES
Appendix A: Average entropy production in terms of quantum relative entropies: Formal derivation

Let us compute the average value of the stochastic quantum entropy production, which reads as

In Eq. (47) we can consider two contributions separately. We start by evaluating

where $\Delta S(\rho,\rho') \equiv S(\rho) - S(\rho')$, with $S(\rho) \equiv -\mathrm{Tr}[\rho \ln \rho]$ the von Neumann entropy, and $S(\rho\|\rho') \equiv \mathrm{Tr}[\rho \ln \rho - \rho \ln \rho']$ is the quantum relative entropy. The second contribution to Eq. (47) instead reads as

provided by Eq. (20) in the main text.

where (a)-(d) are the conditions defined in Appendix B concerning the constraints of unit-trace preservation, Hermiticity preservation and positivity of the matrix $\lambda$. Analogously, comparing Eqs. (51) and (52) with Eqs. (28) and (27), we get

In conclusion, the coefficients $\lambda_{i,j}(t)$ read as

$\mathrm{Im}[\lambda_{03}(t)] = \frac{1}{2} e^{-\Gamma_\beta(t)} \sin(\omega t)$, and all the other coefficients vanish. Accordingly, the Kraus representation of the map $\Lambda_t$ is

with $\sigma_+ \equiv |0\rangle\langle 1|$ and $\sigma_- \equiv (\sigma_+)^\dagger = |1\rangle\langle 0|$. We can now write the quantum map $\Lambda_t$ in the diagonal form

In fact, from the first line in the last step of Eq. (56), we can already identify two diagonal operators, i.e.,

where one can verify that $2\lambda(t)(1 \pm z_\infty) \geq 0$, as $\Gamma_\beta(t) \geq 0$ and $-1 \leq \pm z_\infty \leq 1$. In order to also obtain the operators $E_3$ and $E_4$, we diagonalize the remaining part of the map $\Lambda_t$, which reads as

Thus, if we denote by $D$ the diagonal matrix containing the eigenvalues of $C$ and by $U$ the unitary operator that diagonalizes (i.e., spectrally decomposes) $C$, so that $C = U D U^\dagger$, then the two remaining Kraus operators are respectively equal to

and the corresponding eigenvectors read as

where we have defined a

As a result, the additional Kraus operators have the following expressions:

where $u_i^{(j)}$ denotes the $j$-th element of the vector $u_i$, with $i, j = 1, 2$.
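The diagonalization step described above can be illustrated with a minimal numerical sketch. It assumes a generic map of the form $\rho \mapsto \sum_{ij} C_{ij} F_i \rho F_j^\dagger$ with a Hermitian, positive semidefinite coefficient matrix $C = U D U^\dagger$, for which the Kraus operators are $E_k = \sqrt{d_k} \sum_i U_{ik} F_i$. The function name `kraus_from_coefficients`, the operator basis and the numerical entries of `C` are hypothetical choices made only for illustration; they are not the coefficients of the qubit map $\Lambda_t$, and the toy term below is not trace preserving on its own.

```python
import numpy as np

def kraus_from_coefficients(C, basis, tol=1e-12):
    """Kraus operators of the map rho -> sum_ij C[i, j] F_i rho F_j^dagger.

    C is assumed Hermitian and positive semidefinite. Writing C = U D U^dagger,
    the operators are E_k = sqrt(d_k) * sum_i U[i, k] F_i.
    """
    d, U = np.linalg.eigh(C)  # eigenvalues (ascending), eigenvectors as columns of U
    if d.min() < -tol:
        raise ValueError("C is not positive semidefinite")
    kraus = []
    for k, dk in enumerate(d):
        if dk > tol:  # null eigenvalues do not contribute
            E = sum(U[i, k] * F for i, F in enumerate(basis))
            kraus.append(np.sqrt(dk) * E)
    return kraus

# Hypothetical operator basis and coefficient matrix (illustrative numbers only).
identity = np.eye(2, dtype=complex)
sigma_z = np.diag([1.0, -1.0]).astype(complex)
basis = [identity, sigma_z]
C = np.array([[0.6, 0.1j], [-0.1j, 0.3]])

def apply_coefficients(rho):
    """Apply the map in its non-diagonal (coefficient-matrix) form."""
    return sum(C[i, j] * basis[i] @ rho @ basis[j].conj().T
               for i in range(2) for j in range(2))

def apply_kraus(rho, kraus):
    """Apply the map in its diagonal (Kraus) form."""
    return sum(E @ rho @ E.conj().T for E in kraus)

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
kraus_ops = kraus_from_coefficients(C, basis)
out_coeff = apply_coefficients(rho)
out_kraus = apply_kraus(rho, kraus_ops)
print(np.allclose(out_coeff, out_kraus))
```

The check at the end only confirms that the two forms of the map act identically on a test state; it does not verify complete positivity or trace preservation of a full channel.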
Appendix D: Qubit thermalization: Average entropy production & 2nd statistical moment of $\Delta\sigma$

In this Appendix we report the complete derivation of both the average entropy production and the second statistical moment of $\Delta\sigma$ for the considered case study of qubit thermalization with non-Markovian transient. These calculations are carried out by considering a generic initial state $\rho_0$, but still under the assumption that the TPM scheme is applied when defining the probability distribution of the stochastic entropy production. The notation and all the symbols appearing in the equations below have already been introduced in the main text of the paper and in the previous Appendices.
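The two quantities treated in this Appendix can be illustrated numerically with a toy model. The sketch below works at the level of populations only, assuming that both measured observables are diagonal in the eigenbasis of the invariant state, so that the TPM statistics are captured by a column-stochastic matrix `T` leaving the thermal populations `pi` invariant. The specific form of `T`, the parameters `beta`, `omega`, `s`, the initial populations `p_in`, and the choice of the post-evolution populations as the initial distribution of the backward process are all hypothetical assumptions of this sketch, not the qubit map analysed in the text.

```python
import numpy as np

# Hypothetical parameters: inverse temperature, energy gap, relaxation weight.
beta, omega, s = 1.0, 1.0, 0.4

energies = np.array([0.0, omega])
pi = np.exp(-beta * energies)
pi /= pi.sum()  # thermal populations of the invariant state

# Toy partial-thermalization transition matrix, T[k, m] = P(final k | initial m).
T = (1 - s) * np.eye(2) + s * np.outer(pi, np.ones(2))

p_in = np.array([0.3, 0.7])   # outcome probabilities of the first measurement
p_F = T * p_in[None, :]       # forward joint distribution p_F[k, m]
p_fin = p_F.sum(axis=1)       # outcome probabilities of the second measurement

# Backward transitions built from the invariant state:
# T_dual[m, k] = T[k, m] * pi[m] / pi[k].
T_dual = (T * pi[None, :] / pi[:, None]).T
p_B = T_dual.T * p_fin[:, None]  # backward joint distribution, indexed [k, m]

delta_sigma = np.log(p_F / p_B)  # stochastic entropy production per pair (k, m)
mean = float(np.sum(p_F * delta_sigma))
second_moment = float(np.sum(p_F * delta_sigma**2))
variance = second_moment - mean**2
ift = float(np.sum(p_F * np.exp(-delta_sigma)))  # integral fluctuation relation

def rel_ent(p, q):
    """Classical relative entropy (Kullback-Leibler divergence) D(p || q)."""
    return float(np.sum(p * np.log(p / q)))

# Under these assumptions the average is a difference of relative entropies.
mean_from_rel_ent = rel_ent(p_in, pi) - rel_ent(p_fin, pi)
print(mean, mean_from_rel_ent, variance, ift)
```

In this toy setting the average of $e^{-\Delta\sigma}$ equals one because the backward joint distribution is normalized, and the mean reduces to a difference of relative entropies with respect to the invariant populations.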
Explicitly, the mean value of the stochastic entropy production is provided by

Furthermore, the 2nd statistical moment of $\Delta\sigma$ reads as

Given Eq. (41) for the entropy variance in the main text, valid for $p(a^{\rm in}_0) = 1$, we also write explicitly here the time derivative of $\mathrm{Var}(\Delta\sigma)$:

In order to find a sufficient condition such that $z(t)I(t) \geq 2$, we use the explicit expression for $I(t)$ given in the second-to-last line of Eq. (43) and bound it through the following inequalities [60]

This is a second-order algebraic inequality for $x = e^{-2\Gamma_\beta(t)}$, which is satisfied when the variable $x$ obeys $x \geq x_+$ or $x \leq x_-$, with $x_\pm$ the solutions of the corresponding algebraic equality,

$$x_\pm = \frac{2}{5}\left[1 - e^{-\beta\omega} \pm \sqrt{e^{-2\beta\omega} + 3e^{-\beta\omega} + 1}\,\right]. \qquad (74)$$

Since $x_-$ is negative, the inequality $e^{-2\Gamma_\beta(t)} \leq x_-$ is never satisfied, and the only sensible constraint is $e^{-2\Gamma_\beta(t)} \geq x_+$. The value $x_+$ is always positive, and one can show that it is always smaller than 1 for any positive value of $\beta$. More precisely, it decreases monotonically in $\beta$ from $\sqrt{4/5}$ (value for $\beta = 0$) to $4/5$ (value for $\beta \to \infty$). We therefore have a second bound, which is sufficient for the derivative of the variance and the derivative of the average to have the same sign:

As a consistency check, we can now compare the two bounds and see that the sufficient bound is always stricter than the necessary bound. This can be readily done:

One could also think of improving the sufficient bound by replacing the rational functions in Eqs. (68) and (69) with rational functions of higher order (Padé approximants [50]). However, the refined bounds would probably not be particularly enlightening. Our aim was to show that it is indeed possible to have a parameter regime in which irreversibility mitigation happens, and this is what we demonstrated.
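The stated properties of the roots can be checked numerically with a short sketch. It assumes that the roots have the form $x_\pm = \frac{2}{5}\big[1 - e^{-\beta\omega} \pm \sqrt{e^{-2\beta\omega} + 3e^{-\beta\omega} + 1}\big]$, i.e., that they solve $5x^2 - 4(1 - e^{-\beta\omega})x - 4e^{-\beta\omega} = 0$; this form is inferred from the limiting values quoted above and should be treated as an assumption. The function name `x_roots`, the grid of $\beta$ values and the choice $\omega = 1$ are illustrative.

```python
import numpy as np

def x_roots(beta, omega=1.0):
    """Assumed roots x_+ and x_- of 5 x^2 - 4 (1 - u) x - 4 u = 0, with u = exp(-beta * omega)."""
    u = np.exp(-beta * omega)
    disc = np.sqrt(u**2 + 3.0 * u + 1.0)
    return 0.4 * (1.0 - u + disc), 0.4 * (1.0 - u - disc)

betas = np.linspace(0.0, 10.0, 1001)  # illustrative grid of inverse temperatures
x_plus, x_minus = x_roots(betas)

# Properties quoted in the text, checked on the grid.
all_x_minus_negative = bool(np.all(x_minus < 0.0))
x_plus_below_one = bool(np.all((x_plus > 0.0) & (x_plus < 1.0)))
x_plus_decreasing = bool(np.all(np.diff(x_plus) < 0.0))

# The constraint exp(-2 * Gamma) >= x_+ is equivalent to Gamma <= -0.5 * ln(x_+).
gamma_max = -0.5 * np.log(x_plus)

print(all_x_minus_negative, x_plus_below_one, x_plus_decreasing)
print(x_plus[0], x_plus[-1], gamma_max[0], gamma_max[-1])
```

Rewriting the constraint as an upper bound on $\Gamma_\beta(t)$ makes the temperature dependence of the admissible range explicit: under the assumed form, the bound loosens as $\beta$ grows.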
$p_F(a^{\rm fin}_k | a^{\rm in}_m)$ $p_B(a^{\rm in}_m | a^{\rm fin}_k)$ in terms of the value of $\Delta\Phi_\pi$ that corresponds to the transition $(k, m)$, with $k$ and $m$ labelling the measurement outcomes of $O_{\rm fin}$ and $O_{\rm in}$, respectively.