Response Operators for Markov Processes in a Finite State Space: Radius of Convergence and Link to the Response Theory for Axiom A Systems
Abstract
Using straightforward linear algebra we derive response operators describing the impact of small perturbations to finite state Markov processes. The results can be used for studying empirically constructed (e.g. from observations or through coarse graining of model simulations) finite state approximations of statistical mechanical systems. Recent results concerning the convergence of the statistical properties of finite state Markov approximations of the full asymptotic dynamics on the SRB measure in the limit of finer and finer partitions of the phase space are suggestive of some degree of robustness of the obtained results in the case of Axiom A systems. Our findings give closed formulas for the linear and nonlinear response theory at all orders of perturbation and provide matrix expressions that can be directly implemented in any coding language, as well as bounds on the radius of convergence of the perturbative theory. In particular, we relate the convergence of the response theory to the rate of mixing of the unperturbed system. One can use the formulas derived for finite state Markov processes to recover previous findings obtained on the response of continuous time Axiom A dynamical systems to perturbations, by considering the generator of time evolution for the measure and for the observables. A very basic, low-tech, and computationally cheap analysis of the response of the Lorenz ’63 model to perturbations provides rather encouraging results regarding the possibility of using the approximate representation given by finite state Markov processes to compute the system’s response.
Keywords
Markov process · Response theory · Radius of convergence · Perron–Frobenius operator · Ulam conjecture · Lorenz system
1 Introduction
1.1 A Brief Summary of Response Theory
The development of methods for computing the response of a complex system to small perturbations affecting its dynamics is the subject of very active investigation in many fields of science and of technology. Statistical mechanics provides tools for approaching such a problem through so-called response theories, which allow for evaluating the change in the properties of a system through suitably defined operators that factor in the statistical properties of the unperturbed system and the specific nature of the perturbation one wants to study.
- it is not physically consistent in treating the transition from equilibrium to non-equilibrium dynamics, because it studies the impact on equilibrium systems of perturbations that drive them near (but out of) equilibrium, but does not clarify how a new stationary state is reached and maintained; additionally, it is not suited for studying the response to perturbations of non-equilibrium systems;
- it lacks mathematical rigour, as it is not clear which are the systems for which the response formulas apply, and why they should apply at all.
We can introduce the unperturbed evolution operator \(S_0^t=\exp (t \mathbf {F}\cdot )\), which moves forward in time any function of phase space \(O(\mathbf {x})\) by an interval t according to the unperturbed dynamics, so that \(O(\mathbf {x}(t))=S_0^t O(\mathbf {x}(0))\), and its perturbed counterpart \(S_\epsilon ^t=\exp (t (\mathbf {F} +\epsilon \mathbf {X})\cdot )\), which instead describes the evolution in the perturbed system.
Of course, at this stage one needs to bridge the gap between mathematical formalism and physical meaningfulness. One manages to bring Ruelle’s formalism into the realm of applicability by adopting the chaotic hypothesis [9, 10], which basically says that a high-dimensional chaotic physical system can be treated for all practical purposes as if it were Axiom A if we focus on macroscopic observables. The chaotic hypothesis is the generalisation of the ergodic hypothesis, and provides a firm background for translating the mathematical properties of Axiom A systems into physically meaningful statements. Clearly, the chaotic hypothesis applies far from regimes of metastability and far from critical transitions, where entirely different phenomena appear. The chaotic hypothesis might also be practically problematic when one treats multiscale systems featuring many near-zero Lyapunov exponents; see discussion in [11].
Taking the point of view of the chaotic hypothesis, one has that, after transients have died out, nonequilibrium systems reach a nonequilibrium steady state (NESS) where the phase space is on average contracting (with the rate of contraction corresponding, broadly speaking, to the entropy production of the system [12]), so that one can associate to the hyperbolic strange attractor supporting the invariant measure a Hausdorff dimension that is lower than the dimensionality of the phase space and, in general, not integer [6, 13].
The last piece of the puzzle one needs to lay in order to sort out the above-mentioned criticisms of Kubo’s theory relies on the physical interpretation of how a perturbed equilibrium system reaches a steady state. A convincing point of view on this relies on emphasizing the role of thermostats, which are large physical systems interacting with the system of interest in such a way as to extract the excess heat generated as a result of the energy input due to the perturbation. Thermostats are also responsible for making stationarity possible in the case of forced and dissipative nonequilibrium systems. An extensive treatment of the role of thermostats in equilibrium and nonequilibrium systems in the context of the chaotic hypothesis is given in [14]. We will not elaborate further on this aspect here.
1.2 Transfer Operator Approach
One can point out that the formulas above describe the impact of perturbations in terms of expectation values of a generic observable O, whereas one might like to derive results directly for the impact of the perturbations on the invariant measure.
In [3, 4, 5] one constructs the response of the system to perturbations by following the changes in the individual trajectories and summing over the possible initial configurations distributed according to the unperturbed invariant measure. A different point of view on response theory focuses on studying the properties of the unperturbed and perturbed transfer operators and of their generators (see [15] for an introduction on these mathematical objects), through the construction of an appropriate framework of suitable (Banach) functional spaces where their actions are well defined, able to carefully treat the fundamental differences between the (smooth) unstable and (singular) stable manifolds of the Axiom A systems [16, 17, 18, 19].
The evolution of the measure driven by the system \(\dot{\mathbf {x}}=\mathbf {F}(\mathbf {x})\) up to time \(t\ge 0\) starting from an initial condition at time \(t=0\) is described by the Perron–Frobenius transfer operator \(\mathcal {L}^t\) (see, e.g., [15]), so that \(\rho (\mathbf {x},t)= \mathcal {L}^t \rho (\mathbf {x},0)\). The family \(\{\mathcal {L}^t\}_{t\ge 0}\) forms a one-parameter semigroup, such that \(\mathcal {L}^{t+s}=\mathcal {L}^t\mathcal {L}^s\) and \(\mathcal {L}^0=\mathbf {1}\). The Perron–Frobenius operator \(\mathcal {L}^t\) is the adjoint of the evolution operator, \(S^t=\left( \mathcal {L}^t\right) ^\top \), so that \(\langle S^t O,\rho \rangle = \langle O,\mathcal {L}^t\rho \rangle \), where \(\langle f,g \rangle \) is the action (computation of the expectation value) of the linear functional g (the probability measure) on the test function f (the observable). We have that \(\mathcal {L}^t\nu _0=\nu _0\) \(\forall t\ge 0\), meaning that the invariant measure is an eigenvector of the Perron–Frobenius operator with unit eigenvalue.
One needs to emphasise that the transfer operator approach is more natural in all the cases when our interest focuses on studying the properties of the response of an ensemble of trajectories (initialised according to the unperturbed invariant measure) rather than on individual orbits of a system.
Note that in some applications there is not an obvious separation between the two approaches. Let’s take the problem of constructing climate projections through the use of (extremely complex) numerical climate models, which is one of the core activities summarized in the IPCC reports [27]. Indeed, modelling centers are actively pursuing the preparation of multiple runs starting from an ensemble of initial conditions for a given scenario of forcing in order to estimate more accurately the uncertainties in the projections. Nonetheless, we will not experience an ensemble of realizations of the climatic evolution, but just one.
1.3 Computing the Response
The analysis of high-dimensional complex systems in terms of direct numerical simulation and of time series analysis suffers from the (almost) ubiquitous curse of dimensionality, which makes it hard to represent correctly the details of the dynamics because computational complexity explodes with the number of degrees of freedom. The construction of efficient and accurate algorithms for studying the response of a complex system to perturbations faces serious difficulties. Let’s focus now on the linear case. Some previous studies have emphasised the need for treating separately the contributions to the response in Eq. 3 coming from short and long time delays, and have underlined the need for reducing the complexity of the invariant measure by adding to the background state some stochastic forcing, able to smooth out the singularity of the SRB measure [28, 29].
A promising way to deal with the actual computation of the scalar product in Eq. 3 is to use as time-dependent basis the covariant Lyapunov vectors [30, 31], which automatically separate the contributions to the response coming from the unstable, neutral, and stable directions. This clarifies that the convergence of the formula given in Eq. 3 comes from the two distinct facts that (a) perturbations along the stable directions naturally decay, and (b) perturbations along the unstable directions grow in size, but are dominated by the loss of correlation due to mixing.
Recently, algorithms based upon adjoint methods have shown a good degree of accuracy and seem promising, even if scaling them up to high-dimensional systems has not been attempted yet [32, 33]. A different approach to the problem has been proposed in [7, 34, 35, 36], where, instead of trying to compute ab initio and directly the response given in Eq. 3, the authors construct it a posteriori, probing the system with some test forcings and using the formal properties of the theory to predict the response for new patterns of forcings. One can say that by studying the differential response to similar yet differently modulated perturbations, it is possible to derive the overall response properties of the system.
1.4 This Paper
Any numerical representation of a continuum system builds upon the need to discretize the phase space and, in the case of time-continuous systems, time.
Empirically, using long numerical integrations and defining the set of finite states \(\phi _i\), \(i=1,\ldots ,N\), we can construct the stochastic matrix \(\mathcal {M}_{i,j}\) describing the probability of performing a transition from state \(\phi _i\) to state \(\phi _j\) in a period of time \(\Delta t\). The same operation can in principle be performed using experimental and observational data. A fundamental issue at the core of such procedure is whether for some dynamical systems in the limit of finer and finer partitions covering the phase space (actually, the attractor of the system) with \(N\rightarrow \infty \) one reconstructs the actual invariant measure of the original system. See in [37] a comprehensive discussion of such an issue, the so-called Ulam conjecture, and in [38] some extremely promising applications of finite state Markov processes for studying severely reduced representations of complex systems.
Following the idea that performing the discretization of the phase space amounts to adding a stochastic perturbation to the original dynamical system, with intensity going to zero with the scale of the actual partitions, and exploiting the fact that the SRB measure can be constructed as zero-noise limit (with measure that is absolutely continuous with respect to Lebesgue) of the physical measure, in [39, 40] it has been proposed that the Ulam conjecture applies in the case of Axiom A systems, which are endowed with an SRB measure. The convergence in the case of Anosov diffeomorphisms has indeed been proved provided one adds some noise of asymptotically vanishing intensity (though stronger than the noise induced by the partition itself) to the underlying dynamics [41]. Somehow this is not so surprising because by adding noise one introduces a cutoff below which partitions do indeed work. At any practical level, these results suggest that in the case of Axiom A systems constructing finite state Markov processes using Ulam partitions can do a pretty good job in simulating the true dynamics, if one considers reasonably well-behaved, smooth observables as test functions. Nonetheless, one has to note that different choices for the partitions can lead to very different rates of convergence [37]. See also the discussion and the numerical examples presented in [42].
Apart from the Ulam method, one can follow a mathematically more elegant but practically much harder way to construct finer and finer partitions. As is well known, Axiom A systems possess Markov partitions, i.e. well-defined, metric independent, finite resolution representations of the phase space that refine themselves with the dynamics [6, 14]. Such Markov partitions can be used to construct in the limit the actual SRB measure of the system, and, additionally, following [43], they provide a natural way to build finite Markov chains whose properties converge in the limit to those of the Perron–Frobenius operator of the system.
Having response formulas in the finite case has direct relevance for finite Markov chains and for interpreting the results of reduced models. Another good reason to construct a response theory in a finite state space has to do with the fact that the response operators for Axiom A systems introduced by Ruelle can be written as expectation values of certain observables on the unperturbed SRB measure. Therefore, given what was said above, one can hope to have convergence of the finite state reconstructed response operators to the corresponding true response operator in the limit of infinitely fine partitions of the dynamics. Actually, providing explicit formulas for the response operator for a finite state partition of a system and taking the limit for (suitably defined) finer and finer partitions could be interpreted as a rigorous way of constructing the actual response on the asymptotic SRB measure. One needs to note (see discussion in Sects. 2.1 and 3) that special attention has to be paid when studying the convergence of such operators.
- our results are obtained using basic linear algebra operations in finite dimensional spaces, which can be used to interpret more complex operators acting on infinite dimensional spaces. It is also possible to use the finite dimensional expressions to derive, e.g., the actual response operators for continuous time Axiom A dynamical systems;
- we are able to derive an explicit expression for a lower bound to the radius of convergence of the perturbative theory, and relate it to the mixing properties of the unperturbed system. We also find a (very tentative) expression for such a lower bound in the case of continuous time Axiom A dynamical systems;
- our formulas can be translated into one-line commands in now widely available software tools like R, Octave, or MATLAB \(^\circledR \). This might greatly facilitate the actual implementation of response operators. In particular, we can say that our results provide a direct translation of the response theory into readily implementable algorithms.
2 Response Operators for Finite-State Markov Processes
Let’s consider an ergodic Markov process with a finite number of states defined by the N-component vector \(\mathbf {u}\). We consider the infinite Markov chain generated as \(\mathbf {u_0}\), \(\mathcal {M} \mathbf {u_0}\), \(\ldots \), \(\mathcal {M}^n \mathbf {u_0}\), \(\ldots \) where \(\mathbf {u_0}\) is the initial ensemble of states, and \(\mathcal {M}\in \mathbb {R}^{N\times N} \) is the stochastic transition matrix, whose entry \(\mathcal {M}_{i,j}\) gives the probability of reaching the state i at step n if at step \(n-1\) we are in the state j. The process is taken to be stationary, so that \(\mathcal {M}\) does not change with n. We recall that \(\mathcal {M}\) is such that \(\sum _{i=1}^N \mathcal {M}_{i,j}=1\) and \(\mathcal {M}_{i,j}\ge 0\) \(\forall i,j=1,\ldots ,N\).
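The chain defined above can be iterated numerically. The following is a minimal sketch in plain Python, assuming a hypothetical 3-state column-stochastic matrix (its entries are illustrative and are not taken from this paper). It checks the column normalisation and approximates the invariant measure by repeatedly applying \(\mathcal {M}\) to a uniform initial ensemble; the function names are our own.

```python
def mat_vec(M, v):
    """Multiply a matrix (stored as a list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def invariant_measure(M, tol=1e-14, max_iter=100_000):
    """Iterate u -> M u from the uniform ensemble until the L1 change is below tol."""
    n = len(M)
    u = [1.0 / n] * n
    for _ in range(max_iter):
        u_next = mat_vec(M, u)
        if sum(abs(a - b) for a, b in zip(u_next, u)) < tol:
            return u_next
        u = u_next
    raise RuntimeError("power iteration did not converge")

# Hypothetical example: M[i][j] is the probability of moving from state j to state i,
# so every column sums to 1 (the convention used in the text).
M = [
    [0.5, 0.2, 0.1],
    [0.3, 0.6, 0.3],
    [0.2, 0.2, 0.6],
]

column_sums = [sum(M[i][j] for i in range(3)) for j in range(3)]
u = invariant_measure(M)
print("column sums:", column_sums)
print("invariant measure:", u)
```

For this particular matrix the invariant measure can also be found by hand and is proportional to (5, 9, 7), which gives a simple check of the iteration.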
Our goal is to find a formula for expressing the change in the invariant measure resulting from perturbing the transition matrix \(\mathcal {M}\rightarrow \mathcal {M}+\epsilon m\).
2.1 Well-Posedness and Convergence
In the previous equations, we have used somewhat carelessly the expression \((1-\mathcal {M})^{-1}\). Unfortunately, the matrix \(1-\mathcal {M}\) is not invertible, because all of its columns sum up to zero, or, alternatively, because we know that 1 is an eigenvalue of \(\mathcal {M}\). Nonetheless, the expression makes sense if we apply it to a vector belonging to \({{\mathrm{span}}}\{\mathbf {u_2},\ldots ,\mathbf {u_N}\}\). We now want to prove that:
Lemma 1
If \(\mathcal {M}\) is a Markov transition matrix \(\mathbb {R}^N\rightarrow \mathbb {R}^N\) with eigenvectors \((\mathbf {u_1},\mathbf {u_2},\ldots ,\mathbf {u_N})\), and corresponding eigenvalues \((\lambda _1=1,\lambda _2,\ldots ,\lambda _N)\), \(1>|\lambda _2|\ge \dots \ge |\lambda _N|\), and m is a matrix \(\mathbb {R}^N\rightarrow \mathbb {R}^N\) such that \(\sum _{i=1}^N m_{i,j}=0\), then \(m\mathbf {z} \in {{\mathrm{span}}}\{\mathbf {u_2},\ldots ,\mathbf {u_N}\}\) \(\forall \mathbf {z}\in \mathbb {R}^N\).
Proof
Let’s consider the vector \(\mathbf {y}=m\mathbf {z}\). Its i th component can be written as \(y_i=\sum _{j=1}^N m_{i,j}z_j\). Since \(\sum _{i=1}^N m_{i,j}=0\), we have that \(\sum _{i=1}^N y_i= \sum _{i=1}^N \sum _{j=1}^N m_{i,j} z_j=0\).
Let’s now consider the k th eigenvector \(\mathbf {u}_k\) of \(\mathcal {M}\). We have \(\sum _{j=1}^N \mathcal {M}_{i,j} u_{k;j} = \lambda _k u_{k;i}\). Since \(\sum _{i=1}^N \mathcal {M}_{i,j}=1\), taking the sum over the i components of the previous expression, we obtain: \(\sum _{i=1}^N \sum _{j=1}^N \mathcal {M}_{i,j} u_{k;j} = \sum _{j=1}^N u_{k;j} = \lambda _k \sum _{j=1}^N u_{k;j}\). Therefore, either \(\lambda _k=1\), or \(\sum _{j=1}^N u_{k;j}=0\). We have that if \(k>1\), \(\sum _{j=1}^N u_{k;j}=0\).
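Expanding \(\mathbf {y}=\sum _{k=1}^N a_k \mathbf {u_k}\) on the eigenvectors of \(\mathcal {M}\) (assumed here to form a basis) and summing over the components, we obtain \(0=\sum _{i=1}^N y_i = a_1 \sum _{j=1}^N u_{1;j}\). Since the components of the invariant measure \(\mathbf {u_1}\) sum to 1, we have \(a_1=0\). We conclude that \(\mathbf {y}=m\mathbf {z} \in {{\mathrm{span}}}\{\mathbf {u_2},\ldots ,\mathbf {u_N}\}\) \(\forall \mathbf {z}\in \mathbb {R}^N\). \(\square \)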
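The key step of the proof, namely that \(m\mathbf {z}\) has zero component sum whenever the columns of m sum to zero, is easy to check numerically. The sketch below is only an illustration: the matrix `m` is a hypothetical perturbation chosen by us, and the test vectors are drawn at random with a fixed seed.

```python
import random

def mat_vec(A, v):
    """Multiply a matrix (stored as a list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Hypothetical perturbation matrix: each column sums to zero.
m = [
    [0.10, -0.05, 0.00],
    [-0.10, 0.02, 0.03],
    [0.00, 0.03, -0.03],
]

random.seed(0)
component_sums = []
for _ in range(5):
    z = [random.uniform(-1.0, 1.0) for _ in range(3)]  # arbitrary vector in R^3
    y = mat_vec(m, z)
    component_sums.append(sum(y))  # should vanish up to round-off

print(component_sums)
```

In floating point the sums are not exactly zero but of the order of round-off, which is the reason for the caution on numerical precision expressed in the remark that follows.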
Remark
One needs to note that finite numerical precision might cause trouble, so that one should be careful in eliminating any component along \(\mathbf {u_1}\) at each step before applying \(\sum _{j=0}^\infty \mathcal {M}^j\). Note that in any code we must use the series expression \(\sum _{j=0}^\infty \mathcal {M}^j\) for \((1-\mathcal {M})^{-1}\), because the matrix \(1-\mathcal {M}\) is singular and a direct numerical inversion would typically return NaN or Inf entries, or an error message.
Remark
We wish to underline another method for avoiding the \(\texttt {NaN}\) problem mentioned above. Following [45], we introduce the fundamental matrix of the Markov chain as \(\mathcal {Z}=(1-\mathcal {M}+\mathcal {M}^\infty )^{-1}\), where \(\mathcal {M}^\infty \) is the limit matrix whose columns are all equal to \(\mathbf {u}\). One can show that \(\mathcal {Z}\) exists, as the inverse is well defined given the spectral properties of \(\mathcal {M}-\mathcal {M}^\infty \) [39]. One can show that \(\mathcal {M}^\infty m \mathbf {{z}}=0\) \(\forall \mathbf {z}\in \mathbb {R}^N\). Therefore, in all the previous Eqs. 16–22 we can substitute \((1-\mathcal {M} )^{-1}m=\sum _{j=0}^\infty \mathcal {M}^j m= \mathcal {Z}m=\sum _{j=0}^\infty (\mathcal {M}-\mathcal {M}^\infty )^j m\).
The sensitivity of the unperturbed measure to perturbations given in Eq. 28 can also be cast in terms of \(\rho _\mathcal {M}\), the smallest possible value for the constant controlling the rate of convergence of the iterates \(\mathcal {M}\mathbf {e_i}\), \(\mathcal {M}^2 \mathbf {e_i}\), \(\ldots \), \(\mathcal {M}^n \mathbf {e_i}\) to \(\mathbf {u_1}\), so that \(\forall n\in \mathbb {N}_+, \forall i\in \{1, \ldots ,N\}\) we have that \(||\mathcal {M}^n \mathbf {e_i}-\mathbf {u_1}||_1\le C \rho _\mathcal {M}^n\), \(C\ge 1\) [46, 48]. The sensitivity diverges as \(\rho _\mathcal {M}\) approaches 1, i.e. when the iterates of the unperturbed matrix converge slowly.
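As an illustration of the substitution above, the sketch below evaluates \(\mathcal {Z}m\mathbf {u}\) through the truncated series \(\sum _{j} (\mathcal {M}-\mathcal {M}^\infty )^j m \mathbf {u}\) and uses it as the first-order correction to the invariant measure. That the first-order term equals \(\mathcal {Z}m\mathbf {u}\) is obtained here by inserting \(\mathbf {u}+\epsilon \mathbf {u}^{(1)}\) into the invariance condition for \(\mathcal {M}+\epsilon m\); the matrices, the value of \(\epsilon \), and all function names are hypothetical choices of ours, so this is a sketch under those assumptions rather than the reference implementation.

```python
def mat_vec(A, v):
    """Multiply a matrix (stored as a list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def invariant_measure(M, n_iter=500):
    """Approximate the invariant measure by iterating u -> M u."""
    u = [1.0 / len(M)] * len(M)
    for _ in range(n_iter):
        u = mat_vec(M, u)
    return u

def apply_Z(M, u, v, tol=1e-16, max_terms=10_000):
    """Return sum_{j>=0} (M - M_inf)^j v, where every column of M_inf equals u."""
    n = len(M)
    D = [[M[i][j] - u[i] for j in range(n)] for i in range(n)]
    term, total = list(v), list(v)
    for _ in range(max_terms):
        term = mat_vec(D, term)
        total = [t + d for t, d in zip(total, term)]
        if sum(abs(x) for x in term) < tol:
            return total
    raise RuntimeError("series did not converge")

# Hypothetical unperturbed matrix and zero-column-sum perturbation.
M = [[0.5, 0.2, 0.1], [0.3, 0.6, 0.3], [0.2, 0.2, 0.6]]
m = [[0.10, -0.05, 0.00], [-0.10, 0.02, 0.03], [0.00, 0.03, -0.03]]
eps = 0.01

u = invariant_measure(M)
du1 = apply_Z(M, u, mat_vec(m, u))  # first-order correction Z m u

# Reference: invariant measure of the perturbed matrix, computed directly.
M_eps = [[M[i][j] + eps * m[i][j] for j in range(3)] for i in range(3)]
u_eps = invariant_measure(M_eps)

err_zeroth = sum(abs(a - b) for a, b in zip(u_eps, u))
err_linear = sum(abs(a - b - eps * c) for a, b, c in zip(u_eps, u, du1))
print(err_zeroth, err_linear)
```

The series never requires inverting the singular matrix \(1-\mathcal {M}\), and subtracting \(\mathcal {M}^\infty \) at every step removes any round-off component along the invariant measure. Including the linear correction should reduce the error from order \(\epsilon \) to order \(\epsilon ^2\).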
While the quantities \(||\mathcal {M}||_1^*\), \(\tau _{\mathcal {M}}(1)\), and \(\rho _\mathcal {M}\) are indeed different, they all point to the fact that if the mixing rate of the unperturbed matrix \(\mathcal {M}\) is slow—so that such quantities are close to 1 (so that \(||(1-\mathcal {M})^{-1}||_1^*\) and \(||\mathcal {Z}||_1\) are very large)—then the sensitivity of the measure to perturbations is high. See in [21] a discussion of the link between slow mixing of a system and the presence of rough parameter dependence in its response to perturbations, with some examples of applications in a geophysical context.
Bringing together the results presented in Eqs. 9, 10 and in Eq. 27, we conclude that Eqs. 18–22 provide the exact expression for the invariant measure of the stochastic matrix \(\mathcal {M}+\epsilon m\) \(\forall \epsilon \in \{[-\epsilon _{max}^*,\epsilon _{max}^*] \cap [\epsilon _-,\epsilon _+]\}\).
3 Response Theory for Observables
Let’s now look at the problem in terms of impact of the perturbation m on the expectation value of observables. Observables live in the dual space of the densities, and, given our convention, they are row vectors. They are approximated as having a constant value within each cell of the chosen partition of the phase space. The expectation value of the observable \(\mathbf {\pi }\) with respect to a measure \(\mathbf {w}\) can be written as \(\langle \mathbf {\pi },\mathbf {w}\rangle \), where \(\langle \bullet , \bullet \rangle \) denotes the scalar product. By definition, we have that \(\langle \mathbf {\pi }, A \mathbf {w} \rangle =\langle A^\top \mathbf {\pi }, \mathbf {w}\rangle \), where \(A^\top \) indicates the transpose (and adjoint, because we are studying real functions) of A.
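The duality between measures and observables can be made concrete with a few lines of plain Python. In the sketch below, the matrix `A`, the measure `w`, and the observable `pi` are hypothetical values chosen by us; the observable is stored as a list of its values on the cells of the partition.

```python
def mat_vec(A, v):
    """Multiply a matrix (stored as a list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def expectation(pi, w):
    """Scalar product <pi, w>: expectation of the observable pi on the measure w."""
    return sum(p * x for p, x in zip(pi, w))

# Hypothetical column-stochastic matrix and its invariant measure.
A = [[0.5, 0.2, 0.1], [0.3, 0.6, 0.3], [0.2, 0.2, 0.6]]
w = [5 / 21, 9 / 21, 7 / 21]
pi = [0.0, 1.0, 2.0]  # value of the observable in each cell

mean_pi = expectation(pi, w)
lhs = expectation(pi, mat_vec(A, w))             # <pi, A w>
rhs = expectation(mat_vec(transpose(A), pi), w)  # <A^T pi, w>
print(mean_pi, lhs, rhs)
```

Switching from the response of the measure to the response of an observable therefore only requires transposing the relevant matrices.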
Remark
Equations 22 and 31 provide at all orders the response formulas for the discrete Markov process studied here. If we are constructing empirically the discrete phase space, we expect that different choices of the partitions, corresponding to different approximate representations of the full dynamics, will deliver different results in terms of response. Hence, our results can be model dependent, which is reasonable, as we are starting from a subjective choice on the way we approximate the phase space. In fact, one can empirically test the robustness of the obtained results against a set of given criteria by comparing whether the perturbations to a certain set of relevant observables weakly depend on the specific partition used. We present a very preliminary (and encouraging) numerical study performed on the Lorenz ’63 model [44] later in Sect. 5.
Moreover, as discussed in Sect. 1.4, if we construct finer and finer partitions for studying the response of systems whose unperturbed dynamics features an SRB invariant measure (most notably in the case of Axiom A systems), and indeed if we follow the self-refining Markov partitions of the dynamics, our results should converge to the exact response theory built upon the true SRB measure.
One needs to note that Eq. 27 gives an estimate of the largest possible value of \(\epsilon \) for a given partition, but we are not sure whether the minimum of \(\epsilon _{max}^*\) over all the finer and finer partitions is positive. This corresponds to imposing a bound, uniform in N, on \(||(1-\mathcal {M})^{-1}||_1^*\) or \(||\mathcal {Z}||_1\).
In [39] it is shown that \(L^1\) convergence of the finite state measure constructed using the Ulam method to the actual SRB measure is realized when \(||\mathcal {Z}||_1\) grows asymptotically not faster than \(\log N\), where N is the number of states. The requirement we seem to have here for applying response theory is unavoidably stricter, because computing the response entails considering the expectation value of not necessarily well behaved observables, constructed through nontrivial operations of differentiation of the actual observables of which we want to study the sensitivity to perturbations, see Eq. 2 and [3, 4, 5]. This essential difficulty is exactly what motivates the point of view discussed in [18, 50], where a delicate analysis of the relationship between the tangent space of the unperturbed dynamics, the perturbation flow, and the observable allows one to set up a robust framework for the response theory.
Similarly, in our case, making the response theory work at a practical level means having/choosing m and \(\mathbf {u}\) in such a way that \(||(1-\mathcal {M})^{-1}||_1^*\) or \(||\mathcal {Z}||_1\) grossly overestimates in terms of norm the effect of applying \((1-\mathcal {M})^{-1}\) or equivalently \(\mathcal {Z}\) in, e.g., Eq. 22. Additionally, a suitable choice of the observable \(\pi \) can help to avoid potential singularities in Eq. 36. In other terms, response theory can work much more easily once we get rid of or cure pathological cases.
4 Towards Continuous Time Dynamical Systems
4.1 Linear Response
One needs to note that what in Ruelle’s formulation is causality (time integration in the response starts from 0), in the context of the Markov matrices formalism followed here comes from the algebraic expansion of \((1-\mathcal {M})^{-1}\). The issues of convergence mentioned in the original paper by Ruelle can be translated into the rate of mixing of the system as determined by the properties of \(\mathcal {M}\) discussed in Sect. 2.1.
4.2 Higher Order Terms
5 A Very Basic Numerical Experiment
Fig. 1 Attractor of the Lorenz ’63 system with indication of the cartesian grids used for constructing the partitions of its phase space. See text.
We have then identified a 3-dimensional box \(\mathcal {B}\) containing the attractor, defined as \(\mathcal {B}=\{(x,y,z)\in \mathbb {R}^3 |x\in [-20,20],\quad y\in [-30,30],\quad z\in [0,50]\}\), and subdivided it, à la Ulam, into smaller boxes of identical size using a regularly spaced cartesian grid. We have considered partitions obtained using small boxes with linear dimension given by \(dx=2 \times j\), \(dy=3\times j\), and \(dz=2.5\times j\), along the three directions, with \(j=1,2,4\), see Fig. 1. This amounts to partitioning \(\mathcal {B}\) into \(8000/j^3\) smaller boxes. Note that our construction delivers a much lower resolution with respect to what is used in, e.g., [55].
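The assignment of a phase space point to a box of the grid can be sketched as follows. The box limits and cell sizes are those given in the text; the function names, the flat ordering of the cells, and the convention of assigning points on the upper boundary to the last cell are our own assumptions.

```python
BOX_MIN = (-20.0, -30.0, 0.0)   # lower corner of the box B
BOX_MAX = (20.0, 30.0, 50.0)    # upper corner of the box B
BASE_SIZE = (2.0, 3.0, 2.5)     # dx, dy, dz for j = 1

def grid_counts(j):
    """Number of cells along x, y, z for the partition with scale factor j."""
    return [round((hi - lo) / (s * j)) for lo, hi, s in zip(BOX_MIN, BOX_MAX, BASE_SIZE)]

def n_boxes(j):
    """Total number of cells in the partition of B."""
    nx, ny, nz = grid_counts(j)
    return nx * ny * nz

def box_index(point, j):
    """Return the flat index of the cell containing point, or None if it lies outside B."""
    counts = grid_counts(j)
    idx = []
    for x, lo, hi, s, n in zip(point, BOX_MIN, BOX_MAX, BASE_SIZE, counts):
        if not (lo <= x <= hi):
            return None
        k = int((x - lo) // (s * j))
        idx.append(min(k, n - 1))  # points on the upper boundary go to the last cell
    nx, ny, nz = counts
    return (idx[0] * ny + idx[1]) * nz + idx[2]

print(n_boxes(1), n_boxes(2), n_boxes(4))
print(box_index((0.5, 0.5, 25.1), 1))
```

Only the cells actually visited by the trajectory become states of the Markov chain, which is why the number of states reported in Table 1 is much smaller than the total number of cells.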
We run the model with standard values of the parameters choosing as initial condition \([1\quad 1 \quad 1]^\top \) (in fact, given the global attractivity and ergodicity of the Lorenz attractor, any initial condition can be chosen), and, after discarding a transient of 1000 time units, which brings us safely into the asymptotic regime, we run the model for 50,000 time units with a simple Runge–Kutta 4th order adaptive scheme and obtain the output with time step of 0.001 time units. This takes less than 10 minutes on a present-day commercial laptop with standard specifications using MATLAB \(^\circledR \). We present results at such a low level of sophistication in order to clarify that the approach proposed here is rather robust and of relatively simple implementation.
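A time integration of this kind can be sketched in a few lines. The example below is not the MATLAB setup described above: it swaps the adaptive scheme for a classical fixed-step 4th order Runge–Kutta method, uses a much shorter run, and assumes that the standard parameter values are \(\sigma =10\), \(\rho =28\), \(\beta =8/3\). The function names are ours.

```python
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # assumed standard Lorenz '63 parameters

def lorenz(state, rho=RHO):
    """Right-hand side of the Lorenz '63 equations."""
    x, y, z = state
    return (SIGMA * (y - x), x * (rho - z) - y, x * y - BETA * z)

def rk4_step(f, state, dt):
    """Advance the state by one classical (fixed-step) 4th order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, dt, n_steps, rho=RHO):
    """Return the list of states visited, including the initial condition."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(lambda s: lorenz(s, rho), state, dt)
        trajectory.append(state)
    return trajectory

trajectory = integrate((1.0, 1.0, 1.0), dt=0.01, n_steps=2000)
print(trajectory[-1])
```

The `rho` argument is exposed so that the same routine can be reused for a perturbed run with \(\rho \rightarrow \rho +\epsilon \).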
Table 1 Expectation values of the observables \(x^2\), \(y^2\), \(z^2\), and z and their linear response with respect to the perturbation \(\rho \rightarrow \rho +\epsilon \)
| | \(\langle x^2\rangle \) | \(\langle y^2\rangle \) | \(\langle z^2\rangle \) | \(\langle z \rangle \) | \(\delta [x^2]_1\) | \(\delta [y^2]_1\) | \(\delta [z^2]_1\) | \(\delta [z]_1\) |
|---|---|---|---|---|---|---|---|---|
| Lorenz ’63 Model | 62.9 | 81.2 | 630.0 | 25.6 | 2.8 | 3.7 | 50.3 | 1.01 |
| MC, \(j=1\), \(N^j_{B}=770\) | 63.2 | 82.0 | 630.5 | 23.6 | 2.9 | 3.8 | 50.3 | 1.01 |
| MC, \(j=2\), \(N^j_{B}=205\) | 64.3 | 84.2 | 632.2 | 23.6 | 3.0 | 3.5 | 49.7 | 1.02 |
| MC, \(j=4\), \(N^j_B=56\) | 71.3 | 84.8 | 637.5 | 23.5 | 2.9 | 3.9 | 50.1 | 1.02 |
For each value of j, the boxes \(B^j_k\) define the discrete states \(\phi ^j_k\), \(k=1,\dots ,N_B^j\). By counting the number of times the trajectory is included in each state \(\phi ^j_k\) and normalizing we derive experimentally the asymptotic normalized occupancies \(\bar{u}^j_k\). Instead, by tracking the transitions between the various discrete states, we construct the estimate of the stochastic transition matrix \(\mathcal {M}^j_{p,q}\) describing the probability that the state \(\phi ^j_q\) makes a transition to the state \(\phi ^j_p\) in one time step. By finding the eigenvector corresponding to the unique unit eigenvalue of \(\mathcal {M}^j_{p,q}\), we find the invariant measure, which agrees up to very high precision with the empirical occupancy rate \(\bar{u}^j_k\) computed from the trajectory. As a first step, we evaluate the expectation values of four meaningful observables given by \(x^2\), \(y^2\), \(z^2\), and z, as obtained from the time integration of the Lorenz model and from its discrete representation in terms of Markov chain. Table 1 shows that the agreement is rather good even when extremely coarse resolution is used.
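The counting procedure just described can be sketched as follows, assuming that the trajectory has already been converted into a sequence of integer state labels. The short sequence used here is a made-up example, and the function name is ours.

```python
def estimate_markov_chain(states, n_states):
    """Estimate occupancies and the column-stochastic transition matrix from a state sequence."""
    visits = [0] * n_states
    counts = [[0] * n_states for _ in range(n_states)]
    for s in states:
        visits[s] += 1
    for q, p in zip(states[:-1], states[1:]):
        counts[p][q] += 1  # transition q -> p is stored in row p, column q
    occupancy = [v / len(states) for v in visits]
    M = [[0.0] * n_states for _ in range(n_states)]
    for q in range(n_states):
        departures = sum(counts[p][q] for p in range(n_states))
        for p in range(n_states):
            # A state that is never left keeps an all-zero column in this sketch.
            M[p][q] = counts[p][q] / departures if departures > 0 else 0.0
    return M, occupancy

# Made-up sequence of visited states (labels 0, 1, 2).
states = [0, 1, 2, 0, 1, 1, 2, 0, 2, 0]
M, occupancy = estimate_markov_chain(states, 3)
for row in M:
    print(row)
print(occupancy)
```

With a long trajectory the occupancies and the invariant measure of the estimated matrix should nearly coincide; with a sequence as short as this one they differ noticeably.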
We then show how to compute the response of the system to the perturbation due to the introduction of the vector field \(\epsilon \mathbf {X}(\mathbf {x})\). We keep in mind that when continuous time dynamics is considered, there is a very simple linear relation between the perturbation flow and the corresponding perturbation to the Perron–Frobenius operator, see Eqs. 38–40.
Therefore, we repeat the steps described above for the \(\epsilon \)-perturbed flow (we choose \(\epsilon =0.1\) in order to be on the safe side in terms of convergence), compute the new stochastic transition matrices \(\mathcal {M}^{j,\epsilon }_{p,q}\), and derive the perturbation matrices \(\epsilon m^j_{p,q}= \mathcal {M}^{j,\epsilon }_{p,q}-\mathcal {M}^{j}_{p,q}\). Once \(m^j_{p,q}\) and \(\mathcal {M}^{j}_{p,q}\) are known, we can use them to compute the response of the system at all orders of nonlinearity using Eqs. 22 and 36.
One needs to note that because of the finite integration time considered, of the non-infinitesimal perturbation applied, and of the somewhat arbitrary choice of the boxes, it can happen that the original and perturbed flow are characterized by a different number of discrete states. We have observed such a difference only in the case \(j=1\), involving one single extra state for the perturbed flow, with normalized relative occupancy \({\le }10^{-6}\). This problem can be easily sorted out by imposing a cutoff and removing from the discrete description all states with very low occupancy.
As discussed above, one needs to test accurately the well-posedness and convergence of the expansion in order to be sure to obtain meaningful results. This is not our goal at this stage given such a preliminary numerical test of our results. Therefore, we limit ourselves to the less ambitious yet interesting goal of computing the linear response defined in Eq. 32 for the observables indicated above, using Eq. 48. The results are reported in Table 1 and seem very encouraging. We have that the estimates of the response are very stable with respect to changes in the resolution of the boxes, and agree to a high degree of precision with the results one obtains by empirically evaluating the sensitivity of the observables with respect to the introduction of the perturbation flow using two integrations, as well as, in the case of the z observable, with what is reported in [34]. We note that the results are virtually unchanged if, instead of the high resolution time series with time step of 0.001 time units, one uses sparser observations corresponding to, e.g., a time step of 0.01 time units. Obviously, using a time resolution lower by a factor of s with respect to what is considered here, one derives by tracking the transitions a stochastic transition matrix corresponding to the sth power of the one obtained at higher resolution. This does not affect the results as long as the sampling interval is much shorter than the characteristic time scale of the system, which can be estimated as \({\sim }1/\lambda _1\sim 1.1\) time units, where \(\lambda _1\) is the positive Lyapunov exponent of the system. On much longer time scales, instead, the stochastic matrix is quasi-degenerate, with all columns almost equal to the invariant measure.
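The effect of sparser sampling on the transition matrix can be illustrated with a small example. The sketch below assumes an exactly Markovian chain with a hypothetical 3-state matrix of our choosing, computes its powers, and measures how far the columns are from being identical; the function names are ours.

```python
def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(M, s):
    """Return the s-th power of M by repeated multiplication."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(s):
        result = mat_mul(M, result)
    return result

def column_spread(A):
    """Largest difference between entries of the same row across columns."""
    return max(max(row) - min(row) for row in A)

# Hypothetical one-step transition matrix (columns sum to 1).
M = [[0.5, 0.2, 0.1], [0.3, 0.6, 0.3], [0.2, 0.2, 0.6]]

M_10 = mat_pow(M, 10)    # matrix seen when sampling every 10 steps
M_100 = mat_pow(M, 100)  # sampling interval much longer than the mixing time

print(column_spread(M), column_spread(M_10), column_spread(M_100))
```

When the spread is close to zero all columns coincide with the invariant measure, so the matrix no longer carries usable information on the transitions, which is the quasi-degenerate situation described above.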
6 Conclusions
Taking the point of view of finite state Markov systems, we have been able to construct a perturbation theory for studying the impact of small perturbations to the background dynamics. While previous approaches focus on constructing a theory able to account for the effect of adding small perturbations to the baseline flow, we focus on computing the change in the invariant measure and the change in the expectation values of general observables (one problem being the adjoint of the other) occurring when the Markov transition matrix is perturbed as \(\mathcal {M}\rightarrow \mathcal {M}+\epsilon m\).
The perturbation term \(\epsilon m\) has to be such that all the columns of the new stochastic matrix sum up to 1 and all entries are positive. All of our findings are obtained with rather simple linear algebra manipulations and using basic properties of the stochastic matrices. We can express the response as a perturbation series or, after suitable resummation, using compact exact formulas. We are also able to assess the convergence properties of the response theory by defining a value \(\epsilon ^*_{max}\) such that if \(|\epsilon |\le \epsilon ^*_{max}\) the perturbative expansion converges. The stronger the mixing of the unperturbed system, the larger the value of \(\epsilon ^*_{max}\). These findings match well with previous results providing upper bounds to the sensitivity of stochastic matrices to perturbations.
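As an illustration of the kind of matrix computation summarized here, the following sketch evaluates the linear response of the invariant measure and a resummed expression for a hypothetical 3-state chain. It relies on the standard fundamental-matrix regularization \(Z=(1-\mathcal {M}+\nu \mathbf {1}^T)^{-1}\), which is an assumption of this example and not necessarily the form in which the formulas are written in the paper; the symbol `nu` for the invariant measure, the matrices `M` and `m`, and the value of `eps` are likewise illustrative. The quantity `radius` is the inverse spectral radius of `Z @ m`, which controls the convergence of this particular series and is not claimed to coincide with \(\epsilon ^*_{max}\).

```python
import numpy as np

def invariant_measure(P):
    """Eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    w, v = np.linalg.eig(P)
    nu = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return nu / nu.sum()

# Hypothetical unperturbed column-stochastic matrix.
M = np.array([[0.90, 0.05, 0.10],
              [0.05, 0.90, 0.10],
              [0.05, 0.05, 0.80]])
# Hypothetical perturbation: columns sum to 0, so M + eps*m keeps unit column sums.
m = np.array([[-1.0,  0.0,  0.5],
              [ 1.0, -1.0,  0.5],
              [ 0.0,  1.0, -1.0]])
eps = 1e-3
n = M.shape[0]

nu = invariant_measure(M)
# Regularized inverse of (1 - M); on zero-sum vectors it inverts (1 - M) exactly.
Z = np.linalg.inv(np.eye(n) - M + np.outer(nu, np.ones(n)))

nu1 = Z @ m @ nu                                            # first-order change of the measure
nu_linear = nu + eps * nu1                                  # linear response estimate
nu_resummed = np.linalg.solve(np.eye(n) - eps * Z @ m, nu)  # sum over all orders
nu_exact = invariant_measure(M + eps * m)                   # direct computation, for comparison

# The series sum_k (eps Z m)^k nu converges when |eps| < 1 / spectral_radius(Z m).
radius = 1.0 / np.max(np.abs(np.linalg.eigvals(Z @ m)))
```

The more slowly the unperturbed chain mixes, the closer its subdominant eigenvalues are to 1 and the larger the entries of `Z`, which shrinks `radius`; this is the finite-dimensional counterpart of the link between mixing and convergence noted above.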
Our results provide a direct algorithmic method for studying the response to perturbations for finite state Markov processes and have the advantage of allowing for an immediate and practical change of point of view between response theory seen in terms of changes of the invariant measure or in terms of changes in the expectation values of observables, by simply computing the transpose of the resulting finite dimensional linear operators. Our findings give closed formulas for the linear and nonlinear response theory at all orders of perturbations through explicit matrix expressions that can be directly implemented in any coding language.
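The change of point of view obtained by transposition can be made concrete with a short sketch. It reuses a hypothetical 3-state chain and an arbitrary observable `Phi`; the regularized inverse `Z` and all numerical values are assumptions of this example. The same first-order change in the expectation value is obtained either by perturbing the measure or by applying the transposed operator to the observable and averaging over the unperturbed measure.

```python
import numpy as np

def invariant_measure(P):
    """Eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    w, v = np.linalg.eig(P)
    nu = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return nu / nu.sum()

# Hypothetical chain, perturbation (zero column sums), and observable.
M = np.array([[0.90, 0.05, 0.10],
              [0.05, 0.90, 0.10],
              [0.05, 0.05, 0.80]])
m = np.array([[-1.0,  0.0,  0.5],
              [ 1.0, -1.0,  0.5],
              [ 0.0,  1.0, -1.0]])
Phi = np.array([1.0, 2.0, 4.0])
n = M.shape[0]

nu = invariant_measure(M)
Z = np.linalg.inv(np.eye(n) - M + np.outer(nu, np.ones(n)))
R = Z @ m                            # linear response operator acting on measures

d_phi_measure = Phi @ (R @ nu)       # perturb the measure, then average Phi
Psi = R.T @ Phi                      # transposed operator acting on the observable
d_phi_observable = Psi @ nu          # average the new observable over the old measure

# Finite-difference check against a direct computation with a small eps.
eps = 1e-6
d_phi_fd = (Phi @ invariant_measure(M + eps * m) - Phi @ nu) / eps
```

The two routes differ only in the order in which the matrix products are evaluated, so which one is cheaper depends on whether many observables or many perturbations are of interest.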
We can use our formulas to study the response to perturbations of finite state Markov processes constructed in order to have a simplified and treatable picture of a complex system. Given two different state spaces constructed using different finite partitions covering the attractor of the system, we cannot expect to obtain the same results for the change in the expectation value of a given observable. The results might indeed be model dependent, but this is the obvious price one has to pay because of the subjective choice of the reduced state space. An assessment of the robustness of the obtained results is key to applying our methods in the context of reduced models. Nonetheless, the extremely unsophisticated numerical study reported here on the Lorenz’63 model is quite encouraging in this regard, even if tests should be made on much higher dimensional models.
If the underlying dynamics is Axiom A (or Axiom A equivalent, as in the cases where the chaotic hypothesis applies), one can impose conditions such that the response operators constructed using finer and finer partitions converge to the actual corresponding response operators constructed on the SRB measure. Having in mind the Ulam method, the conditions are stricter than what is needed in order to have convergence of the unperturbed measure, the basic reason being that Ruelle response operators correspond to nontrivial observables. One expects better convergence if the self-refining Markov partitions of the system are considered when constructing the finite state approximations.
Our results can be thought of as intermediate steps at finite precision leading to the correct response formulas in the limit. One needs to add as a caveat that going from finite state to functional spaces is far from trivial and requires a high degree of mathematical precision, which is beyond the scope of this paper. Nonetheless, the finite construction proposed here seems to point at why some important mathematical issues emerge when the Perron–Frobenius operator formalism is considered in a continuum setting. In particular, the need for selecting suitable norms for vectors and linear operators in finite dimension points to the complex requirements in terms of functional spaces described in e.g. [51].
Interestingly, we can use the formulas obtained for finite state Markov processes to study the impact of perturbations to continuous time dynamical systems, after making a suitable identification between the considered transition matrices and the evolution operators for measures and observables. This operation is straightforward because there is a simple linear exact relation between the perturbation in the vector flow of the dynamical system and the perturbation in the Perron–Frobenius operator when infinitesimal time intervals are considered. As a result, we are able to derive in a very simple way previous formulas obtained studying the perturbations to the transfer operator as well as the original expressions proposed by Ruelle for the linear and higher order perturbations in the expectation values of observables. Using the results obtained in the finite state case, we propose a formula for the radius of convergence of the perturbative theory.
One can envision that, in the case where the underlying dynamics is discrete, there is no such one-to-one correspondence between perturbations to the vector field and perturbations to the Markov transition matrix. This can be easily checked when constructing the perturbed Perron–Frobenius operator resulting from adding an \(\epsilon \) correction to the vector field, which results in changes in the Perron–Frobenius operator at all orders in \(\epsilon \). Therefore, the perturbative expansion is different in the two cases. Agreement is instead found in the limit \(\epsilon \rightarrow 0\), or, more practically, when we retain only the terms linear in \(\epsilon \) of the perturbative expansion, i.e. when aiming only at the linear response function.
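The contrast between the continuous and the discrete time case can be sketched formally as follows. The notation is introduced only for this illustration and is an assumption: \(F\) and \(G\) denote the unperturbed and perturbing vector fields of a flow, \(f\) and \(g\) the unperturbed and perturbing parts of a one-dimensional map, \(\rho \) a density, and \(\mathcal {P}_\epsilon \) the perturbed transfer operator. The manipulations are formal and disregard the regularity issues mentioned above.

For the flow \(\dot{x}=F(x)+\epsilon G(x)\), the generator of the evolution of densities is exactly linear in \(\epsilon \):

\[
\partial _t \rho = -\nabla \cdot \left[ (F+\epsilon G)\rho \right] = \mathcal {L}_0\rho +\epsilon \mathcal {L}_1\rho ,\qquad \mathcal {L}_0\rho =-\nabla \cdot (F\rho ),\quad \mathcal {L}_1\rho =-\nabla \cdot (G\rho ).
\]

For the map \(x_{n+1}=f(x_n)+\epsilon g(x_n)\), a formal Taylor expansion of the kernel of the transfer operator instead gives contributions at all orders in \(\epsilon \):

\[
(\mathcal {P}_\epsilon \rho )(x)=\int \delta \left( x-f(y)-\epsilon g(y)\right) \rho (y)\,dy=\sum _{k=0}^{\infty }\frac{(-\epsilon )^k}{k!}\,\partial _x^k\left[ \mathcal {P}_0\left( g^k\rho \right) \right] (x).
\]

Truncating the second expression at \(k=1\) gives a perturbation of the operator that is linear in \(\epsilon \), which is consistent with the agreement of the two descriptions at the level of linear response.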
Future investigations will try, on the one side, to have a sharper mathematical look at the problem of going from finite to infinitely small partitions of the phase space, and, on the other side, to delve into the numerical study of the effectiveness and efficiency of the proposed tools. Apart from testing the results on specific finite state Markov systems, we will test how robust the proposed methods are when studying finite state Markov processes that have been empirically constructed from time series of observations or of numerical simulations of high-dimensional complex systems. One may be led to hope that it could be possible to have an accurate representation of the response of a high dimensional system to perturbations by constructing a smart finite state model well suited to studying specific observables of interest. Of course, in order to deal with the curse of dimensionality, one would like to be able to go beyond the Ulam method and deal with finite partitions of reduced phase spaces where projection is applied on many or even most dimensions.
Our formulas may address the now long-standing problem of constructing suitable algorithms for studying the response of chaotic systems to perturbations. It is extremely hard to construct an algorithm for computing the (linear) response theory directly on the flow, because serious problems emerge when considering the contributions coming from the unstable directions in the tangent space. This might have great relevance for studying problems, like climate dynamics, where a direct construction of the response operator is especially challenging and slightly indirect methods have to be used [35], and where a lot of effort has been devoted to defining the so-called atmospheric regimes and predicting their response to forcings [56].
Footnotes
- 1. Most commonly Markov chains are constructed using row vectors; we use column vectors because we find it easier to perform formal matrix manipulations and because we are closer to the formulation most commonly implemented in scientific software.
- 2. Following [51], one might tentatively consider the norms of the operator acting between the Banach spaces \(\mathcal {B}_{2,q}\) and \(\mathcal {B}_{1,q+1}\).
Notes
Acknowledgments
VL wishes to thank: J. Völlmer for suggesting that the author look into finite state Markov processes; D. Ruelle and S. Vaienti for reading an earlier version of the manuscript; V. Baladi, G. Froyland, T. Kuna, A. Tantet for many stimulating exchanges and for providing some extremely useful references and hints. VL acknowledges the support of the DFG-funded cluster of excellence CliSAP and of the FP7 ERC StG NAMASTE—Thermodynamics of the Climate System (Grant No. 257106). This paper is dedicated to Alexei Likhtman, a colleague who left us way too soon.
References
- 1. Kubo, R.: Statistical-mechanical theory of irreversible processes. I. General theory and simple applications to magnetic and conduction problems. J. Phys. Soc. Jpn. 12(6), 570–586 (1957)
- 2. Lucarini, V., Colangeli, M.: Beyond the linear fluctuation-dissipation theorem: the role of causality. J. Stat. Mech. 2012(05), P05013 (2012)
- 3. Ruelle, D.: Differentiation of SRB states. Commun. Math. Phys. 187(1), 227–241 (1997)
- 4. Ruelle, D.: Nonequilibrium statistical mechanics near equilibrium: computing higher-order terms. Nonlinearity 11(1), 5–18 (1998)
- 5. Ruelle, D.: A review of linear response theory for general differentiable dynamical systems. Nonlinearity 22(4), 855–870 (2009)
- 6. Ruelle, D.: Chaotic Evolution and Strange Attractors. Cambridge University Press, Cambridge (1989)
- 7. Lucarini, V., Sarno, S.: A statistical mechanical approach for the computation of the climatic response to general forcings. Nonlinear Process. Geophys. 18, 7–28 (2011)
- 8. Colangeli, M., Lucarini, V.: Elements of a unified framework for response formulae. J. Stat. Mech. Theory E. 2014, P01002 (2014)
- 9. Gallavotti, G., Cohen, E.G.D.: Dynamical ensembles in stationary states. J. Stat. Phys. 80(5–6), 931–970 (1995)
- 10. Gallavotti, G.: Chaotic hypothesis: Onsager reciprocity and fluctuation-dissipation theorem. J. Stat. Phys. 84(5–6), 899–925 (1996)
- 11. Vannitsem, S., Lucarini, V.: Statistical and dynamical properties of covariant Lyapunov vectors in a coupled atmosphere-ocean model—multiscale effects, geometric degeneracy, and error dynamics. ArXiv e-prints, October 2015
- 12. Gaspard, P.: Time-reversed dynamical entropy and irreversibility in Markovian random processes. J. Stat. Phys. 117, 599–615 (2004)
- 13. Gallavotti, G.: Stationary nonequilibrium statistical mechanics. In: Francoise, J.P., Naber, G.L., Tsun, T.S. (eds.) Encyclopedia of Mathematical Physics, vol. 3, pp. 530–539. Elsevier, Amsterdam (2006)
- 14. Gallavotti, G.: Nonequilibrium and irreversibility. Springer, New York (2014)
- 15. Baladi, V.: Positive Transfer Operators and Decay of Correlations. World Scientific, Singapore (2000)
- 16. Butterley, O., Liverani, C.: Smooth Anosov flows: correlation spectra and stability. J. Mod. Dyn. 1(2), 301–322 (2007)
- 17. Liverani, C., Gouëzel, S.: Compact locally maximal hyperbolic sets for smooth maps: fine statistical properties. J. Differ. Geom. 79, 433–477 (2008)
- 18. Baladi, V.: Linear response despite critical points. Nonlinearity 21(6), T81 (2008)
- 19. Baladi, V.: Linear response, or else. ArXiv e-prints, August 2014
- 20. Engel, K.-J., Nagel, R.: One-parameter semigroups for linear evolution equations. Springer, New York (2001)
- 21. Chekroun, M.D., Neelin, D.J., Kondrashov, D., McWilliams, J.C., Ghil, M.: Rough parameter dependence in climate models and the role of Ruelle-Pollicott resonances. Proc. Natl. Acad. Sci. 111(5), 1684–1690 (2014)
- 22. Tantet, A., Lucarini, V., Lunkeit, F., Dijkstra, H.A.: Crisis of the Chaotic Attractor of a Climate Model: A Transfer Operator Approach. ArXiv e-prints, July 2015
- 23. Hoffman, P.F., Kaufman, A.J., Halverson, G.P., Schrag, D.P.: On the initiation of a snowball earth. Science 281, 1342 (2002)
- 24. Pierrehumbert, R.T., Abbot, D., Voigt, A., Koll, D.: Climate of the neoproterozoic. Annu. Rev. Earth Planet. Sci. 39, 417 (2011)
- 25. Lucarini, V., Fraedrich, K., Lunkeit, F.: Thermodynamic analysis of snowball earth hysteresis experiment: efficiency, entropy production, and irreversibility. Q. J. R. Meteorol. Soc. 136, 2–11 (2010)
- 26. Lucarini, V., Pascale, S., Boschi, V., Kirk, E., Iro, N.: Habitability and multistability in earth-like planets. Astronomische Nachrichten 334(6), 576–588 (2013)
- 27. Intergovernmental Panel on Climate Change [Eds.: T. Stocker et al.]. Climate Change: The Physical Science Basis IPCC Working Group I Contribution to AR5. Cambridge University Press, Cambridge (2013). 2014
- 28. Abramov, R.V., Majda, A.J.: Blended response algorithms for linear fluctuation-dissipation for complex nonlinear dynamical systems. Nonlinearity 20(12), 2793–2821 (2007)
- 29. Abramov, R.V., Majda, A.J.: New approximations and tests of linear fluctuation-response for chaotic nonlinear forced-dissipative dynamical systems. J. Nonlinear Sci. 18, 303–341 (2008). doi: 10.1007/s00332-007-9011-9
- 30. Eckmann, J.P., Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod. Phys. 57, 617–656 (1985)
- 31. Ginelli, F., Poggi, P., Turchi, A., Chaté, H., Livi, R., Politi, A.: Characterizing dynamics with covariant Lyapunov vectors. Phys. Rev. Lett. 99, 130601 (2007)
- 32. Eyink, G.L., Haine, T.W.N., Lea, D.J.: Ruelle’s linear response formula, ensemble adjoint schemes and Lévy flights. Nonlinearity 17(5), 1867 (2004)
- 33. Wang, Q.: Forward and adjoint sensitivity computation of chaotic dynamical systems. J. Comput. Phys. 235, 1–13 (2013)
- 34. Lucarini, V.: Evidence of dispersion relations for the nonlinear response of the Lorenz 63 system. J. Stat. Phys. 134, 381–400 (2009). doi: 10.1007/s10955-008-9675-z
- 35. Lucarini, V., Blender, R., Herbert, C., Ragone, F., Pascale, S., Wouters, J.: Mathematical and physical ideas for climate science. Rev. Geophys. 52(4), 809–859 (2014)
- 36. Ragone, F., Lucarini, V., Lunkeit, F.: A new framework for climate sensitivity and prediction: a modelling perspective. Clim. Dyn. 1–13 (2015). doi: 10.1007/s00382-015-2657-3
- 37. Ding, J., Li, T.Y., Zhou, A.: Finite approximations of Markov operators. J. Comput. Appl. Math. 147(1), 137–152 (2002)
- 38. Tantet, A., van der Burgt, F.R., Dijkstra, H.A.: An early warning indicator for atmospheric blocking events using transfer operators. Chaos 25(3), 036406 (2015)
- 39. Froyland, G.: Approximating physical invariant measures of mixing dynamical systems in higher dimensions. Nonlinear Anal. 32(7), 831–860 (1998)
- 40. Dellnitz, M., Junge, O.: On the approximation of complicated dynamical behavior. SIAM J. Numer. Anal. 36(2), 491–515 (1999)
- 41. Blank, M., Keller, G., Liverani, C.: Ruelle–Perron–Frobenius spectrum for Anosov maps. Nonlinearity 15(6), 1905 (2002)
- 42. Froyland, G.: On Ulam approximation of the isolated spectrum and eigenfunctions of hyperbolic maps. Discr. Contin. Dyn. Syst. 17(3), 671–689 (2007)
- 43. Froyland, G.: Computer-assisted bounds for the rate of decay of correlations. Commun. Math. Phys. 189(1), 237–257 (1997)
- 44. Lorenz, E.N.: Deterministic nonperiodic flow. J. Atmos. Sci. 20, 130–141 (1963)
- 45. Schweitzer, P.J.: Perturbation theory and finite Markov chains. J. Appl. Probab. 5(2), 401–413 (1968)
- 46. Mitrophanov, A.Yu.: Sensitivity and convergence of uniformly ergodic Markov chains. J. Appl. Probab. 42, 1003–1014 (2005)
- 47. Seneta, A.: Explicit forms for ergodicity coefficients and spectrum localization. Linear Algebra Appl. 60, 187–197 (1984)
- 48. Ipsen, I.C.F., Selee, T.M.: Ergodicity coefficients defined by vector norms. SIAM J. Matrix. Anal. Appl. 32(1), 153–200 (2011)
- 49. Seneta, E.: Sensitivity of finite Markov chains under perturbation. Stat. Probab. Lett. 17, 163–168 (1993)
- 50. Bódai, T.: Predictability of threshold exceedances in dynamical systems. ArXiv e-prints, August 2014
- 51. Liverani, C., Gouëzel, S.: Banach spaces adapted to Anosov systems. Ergodic Theory Dyn. Syst. 26, 189–217 (2006)
- 52. Bonatti, C., Diaz, L.J., Viana, M.: Dynamics Beyond Uniform Hyperbolicity: A Global Geometric and Probabilistic Perspective. Springer, New York (2005)
- 53. Tucker, W.: The Lorenz attractor exists. C. R. Acad. Sci. Paris Sér. I Math. 328(12), 1197–1202 (1999)
- 54. Reick, C.H.: Linear response of the Lorenz system. Phys. Rev. E 66, 036103 (2002)
- 55. Froyland, G., Padberg, K.: Almost-invariant sets and invariant manifolds—connecting probabilistic and geometric descriptions of coherent structures in flows. Phys. D 238, 1507–1523 (2009)
- 56. Corti, S., Molteni, F., Palmer, T.N.: Signature of recent climate change in frequencies of natural atmospheric circulation regimes. Nature 398(6730), 799–802 (1999)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
