Abstract
We consider optimal control problems for discrete-time random dynamical systems, finding unique perturbations that provoke maximal responses of statistical properties of the system. We treat systems whose transfer operator has an \(L^2\) kernel, and we consider the problems of finding (i) the infinitesimal perturbation maximising the expectation of a given observable and (ii) the infinitesimal perturbation maximising the spectral gap, and hence the exponential mixing rate of the system. Our perturbations are either (a) perturbations of the kernel or (b) perturbations of a deterministic map subjected to additive noise. We develop a general setting in which these optimisation problems have a unique solution and construct explicit formulae for the unique optimal perturbations. We apply our results to a Pomeau–Manneville map and an interval exchange map, both subjected to additive noise, to explicitly compute the perturbations provoking maximal responses.
1 Introduction
The statistical properties of the long-term behaviour of deterministic or stochastic dynamical systems are strongly related to the properties of invariant or stationary measures and to the spectral properties of the associated transfer operator. When the dynamical system is perturbed it is useful to understand and predict the response of the statistical properties of the system through these objects. When such responses are differentiable, we say that the system exhibits a linear response to the class of perturbations. To first order, this response can be described by a suitable derivative expressing the infinitesimal rate of change in e.g. the natural invariant measure or in the spectrum. Understanding the response of statistical properties to perturbation has particular importance in applications, including to climate science (see e.g. Ghil and Lucarini 2020; Hairer and Majda 2010 and the references therein).
In the present paper we go beyond quantifying responses and address natural problems concerning the optimal response, namely which perturbations elicit a maximal response. For example, given an observation function, which perturbation produces the greatest change in the expectation of this observation, and which perturbation produces the greatest change in the rate of convergence to equilibrium. Continuing the climate science application, one may wish to know which small climate action (which perturbation) would produce the greatest reduction in the average temperature (the expected observation value). We note that by considering trajectories of a perturbed map and using ergodicity, one may view the problem of maximising the response in the expectation of an observation as an infinite-horizon optimal control problem, averaging an observation along trajectories.
The linear response of dynamical systems is an area of intense research and we present a brief overview of the literature that is related to the present work. Early results concerning the response of invariant measures to the perturbation of a deterministic system have been obtained by Ruelle (1997) in the uniformly hyperbolic case. More recently, these results have been extended to several other situations in which one has some hyperbolicity and sufficient regularity of the system and its perturbations. We refer the reader to the survey (Baladi 2014) for an extended discussion of the literature about linear response (and its failure) for deterministic systems.
The mathematical literature on linear response of invariant measures of stochastic or random dynamical systems is more recent. In the framework of continuous-time random processes and stochastic differential equations, linear response results were proved in Hairer and Majda (2010) and Koltai et al. (2019). Results related to the linear response of the stationary measure for diffusion in random media appear in Komorowski and Olla (2005), Gantert et al. (2012), Gantert et al. (2017), Faggionato et al. (2019) and Mathieu and Piatnitski (2018). In the discrete-time case, examples of linear response for small random perturbations of uniformly hyperbolic deterministic systems appeared in Gouëzel and Liverani (2006). In Bahsoun et al. (2020), linear response results are given for random compositions of expanding or non-uniformly expanding maps. In Zmarrou and Homburg (2007) the smoothness of the invariant measure response under suitable perturbations is proved for a class of random diffeomorphisms, but no explicit formula is given for the derivatives; an application to the smoothness of the rotation number of Arnold circle maps with additive noise is presented. Systems generated by the iteration of a deterministic map subjected to i.i.d. additive random perturbations are one class of stochastic systems studied in the present paper (see Sect. 6). The linear response of such systems is considered systematically in Galatolo and Giulietti (2019) and linear response results are proved for perturbations to the deterministic map or to the additive noise. These results are used by Marangio et al. (2019) to extend some results of Zmarrou and Homburg (2007) outside the diffeomorphism case and applied to an idealised model of El Niño-Southern Oscillation, given by a noninvertible circle map with additive noise. Higher derivative results for the response of systems with additive noise are presented in Galatolo and Sedro (2020).
Response results for random systems in the so-called quenched point of view appeared recently in Sedro (2019) and Sedro and Rugh (2020) where the random composition of expanding maps is considered using Hilbert cone techniques and in Dragicevic and Sedro (2020) where the random composition of hyperbolic maps is considered by a transfer operator based approach.
We remark that the addition of random perturbations is not necessarily sufficient to guarantee a linear response. An i.i.d. composition of the identity map and a rotation on the circle is considered in Galatolo (2018), and it is shown that using observables with square-integrable first derivative, one only has Hölder continuity of the response with respect to \(C^0\) perturbations of the circle rotation.
One can similarly consider the linear response of the dominant eigenvalues of the transfer operator under perturbation. In the literature, there are several results describing the way eigenvalues and eigenvectors of suitable classes of operators change when those operators are perturbed in some way, for example classical results concerning compact operators subjected to analytic perturbations (Kato 1995), and quasi-compact Markov operators subjected to \(C^k\) perturbations (Hennion and Hervé 2001). In specific classes of dynamics, differentiability of isolated spectral data is demonstrated in Gouëzel and Liverani (2006) for transfer operators of Anosov maps where the map is subjected to \(C^k\) perturbations and in Koltai et al. (2019) for transfer operators arising from SDEs subjected to \(C^k\) perturbations of the drift.
Optimal linear response questions have been considered in the dynamical setting of homogeneous (and inhomogeneous) finite-state Markov chains (Antown et al. 2018), where explicit formulae are provided for the unique maximising perturbations that (i) maximise the norm of the response, (ii) maximise the expectation of a given observable, and (iii) maximise the spectral gap. The efficient Lagrange multiplier approach created in Antown et al. (2018) for questions (ii) and (iii) will be developed for the infinite-dimensional setting of stochastic integral operators in the present paper. In continuous time, Froyland and Santitissadeekorn (2017) maximised the spectral gap of a numerical discretisation of a periodically forced Fokker-Planck equation (perturbing the velocity field to maximally speed up or slow down the exponential mixing rate). The same problem is considered by Froyland et al. (2020), but for general aperiodic forcing over a finite time, using the Lagrange multiplier approach of Antown et al. (2018). A non-spectral approach to increasing mixing rates by optimal kernel perturbations in discrete time is Froyland et al. (2016).
Related optimal control problems have been considered in Galatolo and Pollicott (2017) where the goal was to find a minimal perturbation realising a specific response to the invariant measure of a deterministic system (about the problem of finding an infinitesimal perturbation realising a given response see also Kloeckner 2018). These kinds of questions and other similar ones were also briefly considered in Galatolo and Giulietti (2019) for random dynamical systems consisting of deterministic maps perturbed by additive noise. Similar problems in the case of probabilistic cellular automata were considered in MacKay (2018).
The present work takes the point of view of Antown et al. (2018), but seeks to treat stochastic dynamical systems on smooth domains, instead of Markov chains on domains consisting of a finite number of states. We prove the existence of unique optimal perturbations, derive explicit formulae for these optimal perturbations, and illustrate the formulae and their conclusions via two topical examples. The move from stochastic matrices in Antown et al. (2018) to stochastic integral operators creates considerable additional technical challenges for the existence of the linear responses, as well as for posing and solving the infinite-dimensional optimisation problems that now arise. We consider the class of stochastic dynamical systems with transfer operators representable by an \(L^2\)-compact, integral operator, which includes deterministic systems perturbed by additive noise. The transfer operator L has the form
$$\begin{aligned} Lf(x)=\int k(x,y)f(y)\,\mathrm{d}y, \end{aligned}$$
(1)
where k is a stochastic kernel; in the case of deterministic systems T with additive noise, \(k(x,y)=\rho (x-T(y))\), with \(\rho \) a probability density representing the distribution of the noise (see Sect. 6). We consider perturbations of two types: firstly, perturbations to the kernel k, and secondly, perturbations to the map T.
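To make this construction concrete, the sketch below discretises a kernel of this form on a uniform grid of \([0,1]\). The doubling map stands in for T and a wrapped Gaussian for \(\rho \); both are arbitrary illustrative choices, not the maps studied later in the paper.

```python
import numpy as np

def transfer_matrix(T, sigma, n=200):
    """Discretise the kernel k(x, y) = rho(x - T(y)) on a uniform grid,
    where rho is a Gaussian density wrapped around the circle."""
    x = (np.arange(n) + 0.5) / n                  # grid midpoints
    d = x[:, None] - T(x)[None, :]                # x_i - T(y_j)
    k = np.zeros((n, n))
    for m in range(-3, 4):                        # wrap the Gaussian around the circle
        k += np.exp(-((d + m) ** 2) / (2 * sigma ** 2))
    k /= np.sqrt(2 * np.pi) * sigma
    k *= n / k.sum(axis=0, keepdims=True)         # enforce stochasticity of k: (1/n) * column sum = 1
    return k / n                                  # matrix M with (M f)_i approximating (Lf)(x_i)

T = lambda y: (2.0 * y) % 1.0                     # stand-in map: the doubling map
M = transfer_matrix(T, sigma=0.1)
```

The matrix is column-stochastic, mirroring the integral-preserving property of L, so 1 is an eigenvalue and its eigenvector approximates the stationary density of the noisy system.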
An outline of the paper is as follows. In Sect. 2 we consider general compact, integral-preserving operators \(L:L^2 \rightarrow L^2\) (see (3)) and state general linear response statements for the normalised fixed points and the leading eigenvalues of these operators (Theorem 2.2 and Proposition 2.6). In Sect. 3, we derive response formulae for the normalised fixed points (Corollary 3.5) and spectral values (Corollary 3.6) of operators of the form (1), under perturbation of the kernel k. In Sect. 4 we consider the problem of finding the perturbation that provokes a maximal response in the average of a given observable (General Problem 1) and the spectral gap (General Problem 2). We show that if the feasible set of perturbations is convex, an optimal solution exists, and that this optimum is unique if the feasible set is strictly convex. In Sect. 5.1, using Lagrange multipliers we derive an explicit formula for the unique optimal kernel perturbation that maximises the expectation of an observable (Theorem 5.4). In Sect. 5.2 we prove an explicit formula for the perturbation that maximises the change in spectral gap (and therefore the rate of mixing) of the system (Theorem 5.6).
In Sect. 6, we specialise our integral operators to annealed transfer operators corresponding to deterministic maps T with additive noise. For these systems, the kernel k has the form \(k(x,y)=\rho (x-T(y))\) for some nonsingular transformation T, and we consider perturbations of the map T directly. Response formulas for these perturbations are developed in Proposition 6.3 and Proposition 6.6 for the invariant measure and the dominating eigenvalues, respectively. In this framework we again prove existence and uniqueness of the map perturbation maximising the derivative of the expectation of an observation (Proposition 7.3) and then derive an explicit formula for the extremiser (Theorem 7.4). Proposition 7.6 and Theorem 7.7 state results analogous to Proposition 7.3 and Theorem 7.4 for the optimisation of the spectral gap and mixing rate.
In Sect. 8 we apply and illustrate the theoretical findings of this work on the Pomeau–Manneville map and a weakly mixing interval exchange, each perturbed by additive noise. For each map we numerically estimate (i) the optimal stochastic perturbation (perturbing the kernel k) and (ii) the optimal deterministic perturbation (perturbing the map T) that maximises the derivatives of the expectation of an observable and the mixing rate. One of the interesting lessons is that to maximally increase the mixing rate of the noisy Pomeau–Manneville map, one should perturb the kernel (stochastic perturbation) to move mass away from the indifferent fixed point or deform the map to transport mass away from the fixed point (deterministic perturbation); see Figs. 4 and 7, respectively. Further numerical outcomes are discussed and explained in Sect. 8.
2 Linear Response for Compact Integral-Preserving Operators
In this section, we introduce general response results for integral-preserving compact operators. We consider both the response of the invariant function to the perturbations and the response of the dominant eigenvalues.
2.1 Existence of Linear Response for the Invariant Function
In the following, we consider integral-preserving compact operators acting on \(L^{2}\), which are not necessarily positive. We will give a general linear response statement for their invariant functions. In Sect. 3 we show how these results can be applied to Hilbert–Schmidt integral operators, which will later be transfer operators of suitable random dynamical systems.
Let \(L^{2}([0,1])\) be the space of square-integrable functions over the unit interval (considered with the Lebesgue measure m); for brevity, we will denote it as simply \(L^{2}\). We remark that the analysis in the rest of the paper can be extended to manifolds, but we keep the setting simple so as not to obscure the main ideas.
Let us consider the space of zero-average functions
$$\begin{aligned} V:=\left\{ f\in L^{2}:\int f\ \mathrm{d}m=0\right\} . \end{aligned}$$
(2)
Definition 2.1
We say that an operator \(L:L^{2}\rightarrow L^{2}\) has exponential contraction of the zero average space V if there are \(C\ge 0\) and \( \lambda <0\) such that \(\forall g\in V\)
$$\begin{aligned} \Vert L^{n}g\Vert _{2}\le Ce^{\lambda n}\Vert g\Vert _{2} \end{aligned}$$
for all \(n\ge 0\).
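As a finite-dimensional illustration of this definition (an arbitrary random example, not an operator from the paper), a strictly positive column-stochastic matrix contracts the discrete zero-average space exponentially:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
# A strictly positive column-stochastic matrix: the discrete analogue of a
# mixing integral-preserving operator.
L = rng.random((n, n)) + 0.1
L /= L.sum(axis=0, keepdims=True)

g = rng.standard_normal(n)
g -= g.mean()                                 # g lies in V: zero average
norms = [np.linalg.norm(g)]
for _ in range(30):
    g = L @ g                                 # iterate the operator on V
    norms.append(np.linalg.norm(g))
# the sequence norms decays geometrically, as in the bound C * exp(lambda * n)
```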
For \({\bar{\delta }}>0\) and \(\delta \in [0,{\bar{\delta }})\), we consider a family of integral-preserving, compact operators \(L_{\delta }:L^{2}\rightarrow L^{2}\); we think of \(L_{\delta }\) as perturbations of \(L_{0}\). We say that \(f_\delta \in L^2\) is an invariant function of \(L_\delta \) if \(L_\delta f_\delta =f_\delta \). We will see that under natural assumptions, the operators \(L_\delta \), \(\delta \in [0,{\bar{\delta }})\), have a family of normalised invariant functions \(f_{\delta } \in L^{2}\). Furthermore, for suitable perturbations the invariant functions vary smoothly in \(L^{2}\) and we get an explicit formula for the resulting derivative \(\frac{df_\delta }{d \delta }\). We remark that since the operators we consider are not necessarily positive, the invariant functions are not necessarily positive.
Theorem 2.2
(Linear response for integral-preserving compact operators) Let us consider a family of compact operators \(L_{\delta }:L^{2}\rightarrow L^{2}\), with \(\delta \in \left[ 0,{\overline{\delta }}\right) \), preserving the integral: for each \(g\in L^2\)
$$\begin{aligned} \int L_{\delta }g\ \mathrm{d}m=\int g\ \mathrm{d}m. \end{aligned}$$
(3)
Then,
-
(I)
The operators have invariant functions in \(L^2\): for each \(\delta \in [0,{\bar{\delta }})\) there is \(g_\delta \ne 0\) such that \(L_\delta g_\delta =g_\delta \).
-
(II)
Suppose \(L_0\) also satisfies the following:
-
(A1)
(mixing of the unperturbed operator) For every \(g\in V\),
$$\begin{aligned} \lim _{n\rightarrow \infty }\Vert L_{0}^{n}g\Vert _{2}=0. \end{aligned}$$
Under this assumption, the unperturbed operator \(L_0 \) has a unique normalised invariant function \(f_0\) such that \(\int {f}_0\ \mathrm{d}m=1\). Furthermore, \(L_0\) has exponential contraction of the zero average space V.
-
(III)
Suppose the family of operators \(L_\delta \) also satisfies the following:
-
(A2)
(\(L_\delta \) are small perturbations and existence of derivative operator at \(f_0\)) Suppose there is a \( K\ge 0\) such that \(\left| |L_{\delta }-L_{0}|\right| _{L^{2}\rightarrow L^{2}}\le K\delta \) for small \(\delta \). Furthermore, suppose there exists \({\hat{f}} \in V\) such that
$$\begin{aligned} \underset{\delta \rightarrow 0}{\lim } \frac{(L_{\delta }-L_{0})}{ \delta }f_0 ={\hat{f}}. \end{aligned}$$
Under these assumptions, the following hold:
-
(a)
There exists a \(\delta _2>0\) such that for each \(0\le \delta <\delta _2 \), the operators \(L_\delta \) have unique invariant functions \({f}_\delta \) such that \(\int {f}_\delta \ \mathrm{d}m=1.\) Furthermore, \(L_\delta \) has exponential contraction on V for \(0<\delta <\delta _2\).
-
(b)
The resolvent operator \(({Id}-L_0)^{-1}:V\rightarrow V\) is continuous.
-
(c)
\(\displaystyle \lim _{\delta \rightarrow 0}\left\| \frac{f_{\delta }-f_{0}}{\delta }-({Id} -L_0)^{-1}{\hat{f}}\right\| _{2}=0;\) thus, \(({Id}-L_0)^{-1}{\hat{f}}\) represents the first-order term in the perturbation of the invariant function for the family of systems \(L_{\delta }\).
Proof
Claim (I): We start by proving the existence of the invariant functions \(g_\delta \) for the operators \(L_\delta \). Since the operators are compact and integral preserving, \(L_\delta \) has an eigenvalue 1 for each \(\delta \). Indeed, let us consider the adjoint operators \(L^*_\delta :L^2\rightarrow L^2\) defined by the duality relation \(\langle L_\delta f,g\rangle =\langle f,L^*_\delta g \rangle \) for all \(f,g\in L^2.\) Because of the integral-preserving assumption, we have \(\langle f, L^*_\delta {\mathbf {1}}\rangle = \langle L_\delta f,{\mathbf {1}}\rangle = \int L_\delta f\ \mathrm{d}m = \int f\ \mathrm{d}m = \langle f,{\mathbf {1}}\rangle \). This implies \(L^*_\delta {\mathbf {1}}={\mathbf {1}}\) and thus, 1 is in the spectrum of \(L^*_\delta \) and \(L_\delta \). Since \(L_\delta \) is compact, every nonzero element of its spectrum is an eigenvalue, and we have nontrivial fixed points for the operators \(L_\delta \).
Claim (III)(a) for \(\delta =0\): Now we prove the uniqueness of the normalised invariant function of \(L_0\). Above we proved that \(L_0 \) has some invariant function \(g_0\ne 0\). The mixing assumption (A1) implies that \(\int g_{0}\ \mathrm{d}m\ne 0\); to see this, we note that if \(\int g_{0}\ \mathrm{d}m=0\), then \(g_0\in V\), and, by (A1), \(g_0\) cannot be a nontrivial fixed point of \(L_0\). We claim that \(f_0=\frac{g_0}{\int g_{0}\ \mathrm{d}m}\) is the unique normalised invariant function for \(L_0\). To see this, suppose there were a second normalised invariant function \(f'_0\); then, \(f'_0-f_0\) would be a nonzero invariant function in V, which is a contradiction.
Claim (II): To show that \(L_0\) has exponential contraction on V, we first note that for \(f\in L^2\), we can write \(f=f_0\int f\ \mathrm{d}m+[f-f_0\int f\ \mathrm{d}m]\). Since \([f-f_0\int f\ \mathrm{d}m]\in V\), it follows from (A1) that \(L_0^n f\rightarrow _{L^2} f_0\int f\ \mathrm{d}m\). Thus, the spectrum of \(L_0\) is contained in the unit disk by the spectral radius theorem. Now suppose \(\lambda \) is in the spectrum of \(L_0\) and \(|\lambda |=1\). By the compactness assumption, there is an eigenvector \(f_{\lambda }\) for \(\lambda \) and then we have \(||L_0^n(f_{\lambda })||_2=||f_{\lambda }||_2\). However, \(L_0^n(f_{\lambda })\rightarrow _{L^2} f_0\int f_\lambda \ \mathrm{d}m\), which is not possible unless \(\lambda =1\). Hence, the spectrum of \(L_0|_V\) is strictly contained in the unit disk. Thus, by the spectral radius theorem, there is an \(n>0\) such that \(||L_0^n|_V||_{L^2\rightarrow L^2}\le \frac{1}{2}\) and we have exponential contraction of \(L_0\) on V.
Claim (III)(a) for small \(\delta >0\): From the assumptions we have \(||L_\delta -L_0||_{L^2\rightarrow L^2}\le K\delta \), and by Part (II) there is an n such that \(||L_0^n|_V||_{L^2\rightarrow L^2}\le \frac{1}{2}\). These facts imply that for small enough \(\delta \) one has \(||L_\delta ^n|_V||_{L^2\rightarrow L^2}\le \frac{2}{3}\) and therefore, \(L_\delta \) is exponentially contracting (and also mixing).
We can apply the argument in Part (II) to the operators \(L_\delta \) and obtain, for each small enough \(\delta \), a unique normalised invariant function \(f_\delta \).
Claim (III)(b): Using the exponential contraction of \(L_0\) on V, we now show that \((\text {Id}-L_{0})^{-1}:V\rightarrow V\) is continuous. Indeed, for \(f\in V\), we get \((\text {Id}-L_0)^{-1}f=f+\sum _{n=1}^{\infty }L_{0}^{n}f\). Since \(L_{0}\) is exponentially contracting on V, and \(\sum _{n=1}^{\infty }Ce^{\lambda n}:=M<\infty ,\) the sum \(\sum _{n=1}^{\infty }L_{0}^{n}f\) converges in V with respect to the \(L^{2}\) norm. The resolvent \((\text {Id}-L_{0})^{-1}:V\rightarrow V\) is then a continuous operator and \(||(\text {Id}-L_{0})^{-1}||_{V\rightarrow V}\le 1+M.\) We remark that since \({\hat{f}}\in V,\) the resolvent can be computed at \({\hat{f}}\).
Claim (III)(c): Now we are ready to prove the linear response formula. We first show that \(\Vert f_{\delta }-f_{0}\Vert _{2}\rightarrow 0\). Since \(f_\delta -f_0\in V\) and \(\Vert L_\delta ^n|_V\Vert _{L^2\rightarrow L^2}\le \frac{2}{3}\) for small \(\delta \), we have
$$\begin{aligned} \Vert f_{\delta }-f_{0}\Vert _{2}=\Vert L_{\delta }^{n}f_{\delta }-L_{0}^{n}f_{0}\Vert _{2}\le \Vert L_{\delta }^{n}(f_{\delta }-f_{0})\Vert _{2}+\Vert (L_{\delta }^{n}-L_{0}^{n})f_{0}\Vert _{2}\le \frac{2}{3}\Vert f_{\delta }-f_{0}\Vert _{2}+\Vert L_{\delta }^{n}-L_{0}^{n}\Vert _{L^2\rightarrow L^2}\Vert f_{0}\Vert _{2}, \end{aligned}$$
from which we obtain \(\Vert f_{\delta }-f_{0}\Vert _{2}\le 3 \Vert L_{\delta }^{n}-L_{0}^{n}\Vert _{L^2\rightarrow L^2}\Vert f_{0}\Vert _{2}\). Since \(\Vert L_\delta -L_0\Vert _{L^2\rightarrow L^2}\le K\delta \) and \(\Vert L_{\delta }^{n}-L_{0}^{n}\Vert _{L^2\rightarrow L^2}\le \sum _{i=1}^n \Vert L_\delta ^{n-i}(L_\delta -L_0)L_0^{i-1}\Vert _{L^2\rightarrow L^2}\), we see that \(\Vert L_\delta ^n-L_0^n \Vert _{L^2\rightarrow L^2}\rightarrow 0 \) as \(\delta \rightarrow 0\) and thus \(\Vert f_{\delta }-f_{0}\Vert _{2} \rightarrow 0\) as \(\delta \rightarrow 0.\)
Since \(f_{0}\) and \(f_{\delta }\) are the invariant functions of \(L_0\) and \(L_\delta \), we have
$$\begin{aligned} (\text {Id}-L_{0})(f_{\delta }-f_{0})=f_{\delta }-L_{0}f_{\delta }=(L_{\delta }-L_{0})f_{\delta }, \end{aligned}$$
and \((L_{\delta }-L_{0})f_{\delta }\in V\) because the operators preserve the integral. By applying the resolvent to both sides we obtain
$$\begin{aligned} \frac{f_{\delta }-f_{0}}{\delta }=(\text {Id}-L_{0})^{-1}\frac{(L_{\delta }-L_{0})}{\delta }f_{\delta }. \end{aligned}$$
Moreover, from assumption (A2), we have for sufficiently small \(\delta \) that
$$\begin{aligned} \left\| \frac{(L_{\delta }-L_{0})}{\delta }f_{\delta }-\frac{(L_{\delta }-L_{0})}{\delta }f_{0}\right\| _{2}\le K\Vert f_{\delta }-f_{0}\Vert _{2}. \end{aligned}$$
Since we already proved that \(\lim _{\delta \rightarrow 0}\Vert f_{\delta }-f_{0}\Vert _{2}=0\), we have \(\frac{(L_{\delta }-L_{0})}{\delta }f_{\delta }\rightarrow {\hat{f}}\) in \(L^2\), and by the continuity of the resolvent we are left with
$$\begin{aligned} \lim _{\delta \rightarrow 0}\frac{f_{\delta }-f_{0}}{\delta }=(\text {Id}-L_{0})^{-1}{\hat{f}}, \end{aligned}$$
converging in the \(L^{2}\) norm. \(\square \)
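The statement and proof above can be checked in finite dimensions, where column-stochastic matrices play the role of the compact integral-preserving operators. In the sketch below (all matrices are arbitrary random choices), the finite-difference response of the normalised fixed point of \(L_\delta =L_0+\delta P\) is compared with the resolvent formula of part (c), with the resolvent evaluated via the Neumann series from the proof of part (b).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

L0 = rng.random((n, n))
L0 /= L0.sum(axis=0, keepdims=True)           # column-stochastic: "integral" preserving
P = rng.random((n, n))
P -= P.mean(axis=0, keepdims=True)            # columns sum to 0, so L0 + delta*P stays stochastic

def fixed_point(L):
    w, v = np.linalg.eig(L)
    f = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return f / f.sum()                        # normalise the fixed point to "integral" 1

f0 = fixed_point(L0)
f_hat = P @ f0                                # f_hat = L_dot f0; it sums to 0, i.e. lies in V

# (Id - L0)^{-1} f_hat via the truncated Neumann series sum_{j>=0} L0^j f_hat
resp, term = np.zeros(n), f_hat.copy()
for _ in range(2000):
    resp += term
    term = L0 @ term

delta = 1e-6
fd = (fixed_point(L0 + delta * P) - f0) / delta   # finite-difference response
```

For this random example `resp` and `fd` agree to several digits, as the theorem predicts.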
We remark that the strategy of proof of Theorem 2.2 is similar to that of Theorem 3 of Galatolo and Giulietti (2019), although the assumptions made are quite different: here we consider a compact integral-preserving operator on \(L^2\), while in Galatolo and Giulietti (2019) several norms are considered to allow low-regularity perturbations and the operator is required to be positive.
It is worth remarking that the above proof gives a description of the spectral picture of \(L_0\). By Theorem 2.2, if \(L_0\) satisfies (A1) then the invariant function is unique, up to normalisation; this shows that 1 is a simple eigenvalue. Furthermore, \(L_0\) preserves the direct sum \(L^2=\) span\(\{f_0\} \oplus V\) and the spectrum of \(L_0\) is strictly inside the unit disk when \(L_0\) is restricted to V. Hence, the spectrum of \(L_0\) is contained in the unit disk and there is a spectral gap.
Remark 2.3
The mixing assumption in (A1) is required only for the unperturbed operator \(L_{0}\). This assumption is satisfied, for example, if \(L_0\) is an integral operator and an iterate of this operator has a strictly positive kernel, see Corollary 5.7.1 of Lasota and Mackey (1985). Later in Remark 6.4 we show this assumption is verified for a wide range of examples of stochastic dynamical systems.
2.2 Existence of Linear Response of the Dominant Eigenvalues
In this section, we consider the existence of linear response for the second largest eigenvalues (in magnitude) and provide a formula for the linear response. An important object needed to quantify linear response statements is a “derivative” of the operator \(L_\delta \) with respect to the perturbation.
Definition 2.4
We define \({\dot{L}}:L^2\rightarrow V\) as the unique linear operator satisfying
$$\begin{aligned} {\dot{L}}f=\lim _{\delta \rightarrow 0}\frac{(L_{\delta }-L_{0})}{\delta }f\quad \text {for all } f\in L^{2}. \end{aligned}$$
Let \({\mathcal {B}}(L^2)\) denote the space of bounded linear operators from the Banach space \(L^2\) to itself and r(L) denote the spectral radius of an operator L; we begin with the following definition.
Definition 2.5
(Hennion and Hervé 2001, Definition III.7) Let \(s\in {\mathbb {N}}, s\ge 1\). We say that \(L\in {\mathcal {B}}(L^2([0,1],{\mathbb {C}}))\) has s dominating simple eigenvalues if there exist closed subspaces E and \({\tilde{E}}\) such that
-
1.
\(L^2([0,1],{\mathbb {C}}) = E\oplus {\tilde{E}}\),
-
2.
\(L(E)\subset E\), \(L({\tilde{E}})\subset {\tilde{E}}\),
-
3.
dim\((E)=s\) and \(L|_{E}\) has s geometrically simple eigenvalues \(\lambda _i\), \( i=1,\dots , s\),
-
4.
\(r(L|_{{\tilde{E}}})<\min \{|\lambda _i|:i=1,\dots ,s\}\).
Adapting Theorem III.8 and Corollary III.11 of Hennion and Hervé (2001) to our situation, we can now state a linear response result for these eigenvalues.
Proposition 2.6
Let \(L_\delta :L^2([0,1],{{\mathbb {C}}} )\rightarrow L^2([0,1],{{\mathbb {C}}})\), where \(\delta \in [0,{{\bar{\delta }}})=:I_0\), be integral-preserving (see equation (3)) compact operators. Assume that the map \(\delta \mapsto L_\delta \) is in \(C^1(I_0,{\mathcal {B}}(L^2([0,1],{{\mathbb {C}}})))\) and \(L_0\) is mixing (see (A1) in Theorem 2.2). Then, \(\lambda _{1,0}:= 1\in \sigma (L_0)\) and \(r(L_0)=1\). Let \({\mathcal {I}}\subset \sigma (L_0)\setminus \{1\}\) be the eigenvalue(s) of maximal modulus strictly inside the unit disk; assume they are geometrically simple and let \(s:=|{\mathcal {I}}|+1\). Then there exists an interval \(I_1:=[0,\delta _1) \), \(I_1\subset I_0\) such that for \(\delta \in I_1\), \(L_\delta \) has s dominating simple eigenvalues. Thus, there exist functions \(e_{i, (\cdot )},\ {\hat{e}}_{i,(\cdot )}\in C^1(I_1,L^2([0,1],{{\mathbb {C}}}))\) and \(\lambda _{i,(\cdot )}\in C^1(I_1,{{\mathbb {C}}})\) such that for \(\delta \in I_1\) and \(i,j = 2,\dots , s\)
-
(i)
\(L_\delta e_{i,\delta } = \lambda _{i,\delta } e_{i,\delta }\), \(L^*_\delta {\hat{e}}_{i,\delta } = \lambda _{i,\delta }{\hat{e}}_{i,\delta }\),
-
(ii)
\(\langle e_{i,\delta },{\hat{e}}_{j,\delta }\rangle _{L^2([0,1],{{\mathbb {C}}})} = \delta _{i,j}\), where \(\delta _{i,j}\) is the Kronecker delta.
Furthermore, let \({\dot{\lambda }}_i\in {{\mathbb {C}}}\) satisfy
$$\begin{aligned} {\dot{\lambda }}_i=\lim _{\delta \rightarrow 0}\frac{\lambda _{i,\delta }-\lambda _{i,0}}{\delta }; \end{aligned}$$
then
$$\begin{aligned} {\dot{\lambda }}_i=\langle {\dot{L}}e_{i,0},{\hat{e}}_{i,0}\rangle _{L^2([0,1],{{\mathbb {C}}})}, \end{aligned}$$
(15)
where \({\dot{L}}\) is as in Definition 2.4.
Proof
From Theorem 2.2 and the discussion following it, \(1\in \sigma (L_0)\) and \(r(L_0)=1\).
We now use Theorem III.8 in Hennion and Hervé (2001) to obtain the existence of linear response and Corollary III.11 (Hennion and Hervé 2001) to obtain the formula. We begin by verifying the two hypotheses of Theorem III.8 (Hennion and Hervé 2001). We remark that our map \(\delta \mapsto L_\delta \) belonging to \(C^1([0,{{\bar{\delta }}}),{\mathcal {B}}(L^2([0,1],{{\mathbb {C}}})))\) can be extended to a map \(C^1((-{{\bar{\delta }}},{{\bar{\delta }}}),{\mathcal {B}}(L^2([0,1],{{\mathbb {C}}})))\).
Doing so, hypothesis (H1) of Theorem III.8 (Hennion and Hervé 2001) is satisfied. Since \(r(L_0)=1\), we just need to show that \(L_0\) has s dominating eigenvalues. Since \(L_0 \) is a compact operator, the eigenvalues \(\lambda _{i,0}\in {\mathcal {I}}\) are isolated. Let \(\Pi _i\) be the eigenprojection onto the eigenspace of \(\lambda _{i,0}\) and \(E_i:=\Pi _i (L^2([0,1],{{\mathbb {C}}}))\). Define the eigenspaces \(E:=\bigoplus _{i=1}^s E_i\) and \({\widetilde{E}}: = (\text {Id} -\sum _{i=1}^s\Pi _i)(L^2([0,1],{{\mathbb {C}}}))\). We thus have:
-
(1)
\(L^2([0,1],{{\mathbb {C}}}) = E\oplus {\widetilde{E}}\).
-
(2)
\(L_0\left( E\right) \subset E\) and \(L_0({\widetilde{E}})\subset {\widetilde{E}}\).
-
(3)
dim\(\left( E\right) =s\) and \(L_0|_{E}\) has s simple eigenvalues \(\{\lambda _{1,0}\}\cup {\mathcal {I}}\). This point follows from the assumption that the eigenvalues in \({\mathcal {I}}\) are geometrically simple and the fact that \(\lambda _{1,0}\) is simple (see Theorem 2.2).
-
(4)
\(r(L_0|_{{\widetilde{E}}}) <|\lambda _{i,0}|\) where \(\lambda _{i,0}\in {\mathcal {I}}\).
Thus, \(L_0\) satisfies hypothesis (H2) of Theorem III.8 since it has s dominating simple eigenvalues and \(r(L_0)=1\). Hence, from Theorem III.8 (Hennion and Hervé 2001), the map \( \delta \mapsto \lambda _{i,\delta }\) is differentiable at \(\delta =0\).
We can now apply the argument in Corollary III.11 (Hennion and Hervé 2001) for \(\lambda _{i,0}\) to obtain (15) (the result and proof of Corollary III.11 (Hennion and Hervé 2001) are stated for the top eigenvalue; however, the argument still holds for any eigenvalue \(\lambda _{i,0}\in {\mathcal {I}}\) by changing the index value in the proof of the corollary). \(\square \)
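The eigenvalue response formula (15) can be sanity-checked on matrices. The sketch below (arbitrary illustrative data, not operators from the paper) builds an operator with a chosen simple real spectrum, takes right and left eigenvectors normalised so that their pairings form the Kronecker delta as in (ii), and compares \(\langle {\dot{L}}e_{i,0},{\hat{e}}_{i,0}\rangle \) with a finite difference of the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
Vm = rng.random((n, n)) + np.eye(n)           # columns: right eigenvectors of L0
lams = np.array([1.0, 0.6, 0.3, 0.1, -0.2])   # chosen simple, well-separated real spectrum
L0 = Vm @ np.diag(lams) @ np.linalg.inv(Vm)
P = rng.random((n, n))                        # plays the role of L_dot

i = 1                                         # track the eigenvalue lambda = 0.6
e = Vm[:, i]                                  # right eigenvector
e_hat = np.linalg.inv(Vm)[i, :]               # left eigenvector; pairing <e_j, e_hat_i> = delta_ij
lam_dot = e_hat @ P @ e                       # the response formula (15)

delta = 1e-7
w0 = np.sort(np.linalg.eigvals(L0).real)      # sorted spectrum: 0.6 sits at index 3
w1 = np.sort(np.linalg.eigvals(L0 + delta * P).real)
fd = (w1 - w0) / delta                        # finite-difference eigenvalue responses
```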
3 Application to Hilbert–Schmidt Integral Operators
In this section, we apply the results of the previous section to Hilbert–Schmidt integral operators and suitable perturbations. The operators we consider are compact operators on \(L^{2}([0,1],{{\mathbb {R}}})\) (or \(L^{2}([0,1],{{\mathbb {C}}})\)); for brevity we will denote \(L^{2}:=L^{2}([0,1],{{\mathbb {R}}})\). To avoid confusion we point out that in the following we will also consider the space \(L^{2}([0,1]^{2})\) of square integrable real functions on the unit square; this space contains the kernels of the operators we consider.
Let \(k\in L^{2}([0,1]^{2})\) and consider the operator \(L:L^2 \rightarrow L^2\) defined in the following way: for \(f\in L^{2}\)
$$\begin{aligned} Lf(x)=\int k(x,y)f(y)\,\mathrm{d}y; \end{aligned}$$
such an operator is called a Hilbert–Schmidt integral operator. Such operators may represent the annealed transfer operators of systems perturbed by additive noise (see Sect. 6).
We now list some well-known and basic facts about Hilbert–Schmidt integral operators with kernels in \(L^{2}([0,1]^{2})\):
-
The operator \(L:L^{2}\rightarrow L^{2}\) is bounded and
$$\begin{aligned} ||Lf||_{2}\le ||k||_{L^{2}([0,1]^{2})}||f||_{2} \end{aligned}$$
(6)
(see Proposition 4.7 in II.§4 of Conway 2013).
-
If \(k\in L^{\infty }([0,1]^{2})\), then
$$\begin{aligned} ||Lf||_{\infty }\le ||k||_{L^{\infty }([0,1]^{2})}||f||_{1} \end{aligned}$$
(7)
and the operator \(L:L^1\rightarrow L^{\infty }\) is bounded. Furthermore, \(\Vert L\Vert _{L^p\rightarrow L^\infty }\le \Vert k\Vert _{L^\infty ([0,1]^2)}\) for \(1\le p\le \infty \).
-
If for almost every \(y\in [0,1]\) we have
$$\begin{aligned} \int k(x,y) \mathrm{d}x=1, \end{aligned}$$
then the Hilbert–Schmidt integral operator associated to the kernel k is integral preserving (satisfies (3)).
-
The operator \(L:L^2\rightarrow L^2\) is compact (see Kolmogorov and Fomin 1961).
Combining the last two points, we have from Theorem 2.2 that such an operator has an invariant function in \(L^{2}\). Furthermore, for \(k\in L^\infty ([0,1]^2)\) we have an analogous result.
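The Hilbert–Schmidt bound (6) is straightforward to observe on a grid discretisation; the kernel and the test functions below are arbitrary random choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
k = rng.random((n, n))                        # grid values k(x_i, y_j) of a random kernel
hs_norm = np.sqrt((k ** 2).sum() / n ** 2)    # ||k||_{L^2([0,1]^2)} by the midpoint rule

def L(f):
    return k @ f / n                          # (Lf)(x_i) approximates the integral of k(x_i, y) f(y)

def l2(f):
    return np.sqrt((f ** 2).sum() / n)        # ||f||_{L^2([0,1])}

fs = [rng.standard_normal(n) for _ in range(20)]
checks = [l2(L(f)) <= hs_norm * l2(f) + 1e-12 for f in fs]   # inequality (6) on the grid
```

The discrete inequality is exact (it is Cauchy–Schwarz applied row by row), so every check passes.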
Lemma 3.1
Let \(L:L^2\rightarrow L^2\) be an integral operator, with integral-preserving kernel \(k\in L^\infty ([0,1]^2)\), that is mixing (satisfies (A1) of Theorem 2.2). Then, there exists a unique fixed point \(f\in L^\infty \) of L satisfying \(\int f\ \mathrm{d}m =1\). Furthermore, if the kernel is nonnegative, then f is nonnegative.
Proof
Since k is an integral-preserving kernel, \(L_0\) satisfies (3). Thus, we can apply Theorem 2.2 to conclude that there exists a unique \(f\in L^2\), \(\int f\ \mathrm{d}m=1\), such that \(Lf=f\). Noting that \(k\in L^\infty ([0,1]^2)\), we have from inequality (7) that \(f\in L^\infty \).
We now assume k is nonnegative. Let \(k^j\) be the kernel of the operator \(L^{j}\). Since k is a nonnegative, integral-preserving kernel, we have
$$\begin{aligned} k^{j}(x,y)=\int k(x,z)k^{j-1}(z,y)\,\mathrm{d}z\le \Vert k\Vert _{L^\infty ([0,1]^2)}\int k^{j-1}(z,y)\,\mathrm{d}z=\Vert k\Vert _{L^\infty ([0,1]^2)}; \end{aligned}$$
it easily follows that \(\Vert k^j\Vert _{L^\infty ([0,1]^2)}\le \Vert k\Vert _{L^\infty ([0,1]^2)}\). Thus, for any probability density \(g\in L^1\), we have \(\Vert L^jg\Vert _\infty \le \Vert k\Vert _{L^\infty ([0,1]^2)}\); thus, by Corollary 5.2.2 in Lasota and Mackey (1985), there exists a probability density \({\hat{f}}\in L^1\) such that \(L{\hat{f}} = {\hat{f}}\). Since f is the unique invariant function with integral 1, we have \({\hat{f}}=f\); thus, f is a probability density. \(\square \)
3.1 Characterising Valid Perturbations and the Derivative Operator
In this subsection we consider perturbations of integral-preserving Hilbert–Schmidt integral operators for which assumption (A2) of Theorem 2.2 can be verified and the derivative operator \( {\dot{L}}\) computed. We begin by characterising the set of perturbations under which the integral-preserving property of the operators is retained.
Consider the set \(V_{\ker }\) of kernels having zero average in the x direction, defined as
$$\begin{aligned} V_{\ker }:=\Big \{k\in L^{2}([0,1]^{2}):\int k(x,y)\,\mathrm{d}x=0\ \hbox {for a.e. }y\in [0,1]\Big \}. \end{aligned}$$
Lemma 3.2
Consider a kernel operator \(A:L^{2}([0,1]) \rightarrow L^{2}([0,1])\) defined by \(Af(x)=\int k(x,y)f(y)\mathrm{d}y\). Then, the following are equivalent
-
1.
\(A(L^{2}([0,1]))\subseteq V\),
-
2.
\(k\in V_{\ker }\).
Proof
Clearly, the second condition implies the first. For the other direction we prove the contrapositive. If \(\int k(x,y)\mathrm{d}x\ne 0\) on a set of positive measure, then there exist \(\epsilon >0\) and a set S of positive measure \(m(S)>0\) such that \(\int k(x,y)\mathrm{d}x\ge \epsilon \) for each \(y\in S\) or \(\int k(x,y)\mathrm{d}x\le -\epsilon \) for each \(y\in S\). Suppose \(\int k(x,y)\mathrm{d}x\ge \epsilon \) on this set; consider \(f:={\mathbf {1}}_{S}\) and \( g:=Af.\) Then, \(g(x)=\int k(x,y){\mathbf {1}}_{S}(y)\mathrm{d}y\) and we have \(\int g(x)\mathrm{d}x= \int _S \int k(x,y) \mathrm{d}x \mathrm{d}y \ge \epsilon \ m(S)>0 \), so \( g\notin V\). The other case \(\int k(x,y)\mathrm{d}x\le -\epsilon \) is analogous. \(\square \)
We now prove that \(V_{\ker }\) is closed.
Lemma 3.3
The set \(V_{\ker }\) is a closed vector subspace of \( L^{2}([0,1]^{2}).\)
Proof
The fact that \(V_{\ker }\) is a vector space is trivial. For fixed \(f\in L^{2}([0,1])\), the set of \(k\in L^2([0,1]^2)\) such that \(\int k(x,y)f(y)\mathrm{d}y\in V\) is closed. To see this, define the function \(K_{f}:L^{2}([0,1]^{2})\rightarrow L^{2}([0,1])\) as
$$\begin{aligned} K_{f}(k)(x):=\int k(x,y)f(y)\,\mathrm{d}y. \end{aligned}$$
By (6), \(K_{f}\) is continuous. Since V is closed in \(L^{2}([0,1])\), this implies that \(K_{f}^{-1}(V)\) is closed in \(L^{2}([0,1]^{2}).\) Finally, \(V_{\ker }\) is closed in \(L^2([0,1]^2)\) because \(V_{\ker }=\cap _{f\in L^{2}([0,1])}K_{f}^{-1}(V)\). \(\square \)
We now introduce the type of perturbations which we will investigate throughout the paper. Let \(L_{\delta }:L^{2}\rightarrow L^{2}\) be a family of integral operators, with kernels \(k_{\delta }\in L^{2}([0,1]^{2})\), given by
$$\begin{aligned} (L_{\delta }f)(x)=\int k_{\delta }(x,y)f(y)\,\mathrm{d}y. \end{aligned}$$
Lemma 3.4
Let \(k_{\delta }\in L^{2}([0,1]^{2})\) for each \(\delta \in [0,{\bar{\delta }}).\) Suppose that
$$\begin{aligned} k_{\delta }=k_{0}+\delta \cdot {\dot{k}}+r_{\delta }, \end{aligned}$$(9)
where \({\dot{k}},\ r_\delta \in L^{2}([0,1]^{2})\) and \( ||r_\delta ||_{L^{2}([0,1]^{2})} = o(\delta ).\) The bounded linear operator \({\dot{L}}:L^2\rightarrow V\) defined by
$$\begin{aligned} ({\dot{L}}f)(x):=\int {\dot{k}}(x,y)f(y)\,\mathrm{d}y \end{aligned}$$
satisfies
$$\begin{aligned} \Vert L_{\delta }-L_{0}-\delta \cdot {\dot{L}}\Vert _{L^{2}\rightarrow L^{2}}=o(\delta ). \end{aligned}$$
If additionally the derivative of the map \(\delta \mapsto k_\delta \) with respect to \(\delta \) varies continuously in a neighborhood of \(\delta =0\), then \(\delta \mapsto L_\delta \) has a continuous derivative in a neighborhood of \(\delta =0\).
Proof
By integral preservation of \(L_\delta \) and the fact that \({\dot{k}}\in L^2([0,1]^2)\), one sees that \({\dot{L}}:L^2\rightarrow V\) and is bounded. By (9) and (6),
$$\begin{aligned} \Vert L_{\delta }-L_{0}-\delta \cdot {\dot{L}}\Vert _{L^{2}\rightarrow L^{2}}\le \Vert r_{\delta }\Vert _{L^{2}([0,1]^{2})}=o(\delta ). \end{aligned}$$
Proceeding similarly, one shows that if the map \(\delta \mapsto k_\delta \) has a continuous derivative with respect to \(\delta \) in a neighborhood of \(\delta =0\), then \(\delta \mapsto L_\delta \) has a continuous derivative. Indeed we are supposing that for each \(\delta \in [0,{\overline{\delta }})\) there is \({\dot{k}}_\delta \) such that for small enough h
$$\begin{aligned} k_{\delta +h}=k_{\delta }+h\cdot {\dot{k}}_{\delta }+r_{\delta ,h}, \end{aligned}$$
where \({\dot{k}}_\delta ,\ r_{\delta ,h} \in L^{2}([0,1]^{2})\), \( ||r_{\delta ,h} ||_{L^{2}([0,1]^{2})} = o(h)\) and furthermore \(\delta \mapsto {\dot{k}}_\delta \) is continuous. We have then by (6) that the associated operators \({\dot{L}}_\delta \), defined as
$$\begin{aligned} ({\dot{L}}_{\delta }f)(x):=\int {\dot{k}}_{\delta }(x,y)f(y)\,\mathrm{d}y, \end{aligned}$$
also vary continuously in \(\delta \). \(\square \)
3.2 A Formula for the Linear Response of the Invariant Function and Its Continuity
Now we apply Theorem 2.2 to Hilbert–Schmidt integral operators to obtain a linear response formula for \(L^2\) perturbations.
Corollary 3.5
(Linear response formula for kernel operators) Suppose \(L_{\delta }:L^2\rightarrow L^2\) are integral-preserving (satisfying (3)) integral operators with stochastic kernels \(k_{\delta }\in L^2([0,1]^{2})\) as in (9). Suppose \(L_0\) satisfies assumption (A1) of Theorem 2.2. Then \({\dot{k}}\in V_{\ker }\), the system has linear response for this perturbation and an explicit formula for it is given by
$$\begin{aligned} \lim _{\delta \rightarrow 0}\frac{f_{\delta }-f_{0}}{\delta }=(\mathrm{Id}-L_{0})^{-1}{\dot{L}}f_{0}, \end{aligned}$$(12)
with convergence in \(L^{2}.\)
Proof
Since \(L_\delta \), \(\delta \in [0,{{\bar{\delta }}})\), is integral preserving, we have \((L_\delta -L_0)(L^2)\subset V\) and therefore, \(k_\delta -k_0\in V_{\ker }\) by Lemma 3.2, i.e. \(\delta \cdot {\dot{k}}+r_\delta \in V_{\ker }\). Then \({\dot{k}}+\frac{r_\delta }{\delta }\in V_{\ker }\) for each \(\delta \). Since \(\frac{r_\delta }{\delta }\rightarrow 0\) in \(L^2\) and \(V_{\ker }\) is a closed subspace we have \({\dot{k}}\in V_{\ker }\). Furthermore by (9) there is a \(K\ge 0\) such that
$$\begin{aligned} \Vert L_{\delta }-L_{0}\Vert _{L^{2}\rightarrow L^{2}}\le K\delta . \end{aligned}$$
Hence the family of operators satisfies the first part of assumption (A2). The second part of this assumption is established by the first result of Lemma 3.4.
Since the operators \(L_\delta \) are compact, integral preserving, and satisfy assumptions (A1) and (A2), we can conclude by applying Theorem 2.2 to this family of operators, obtaining the linear response formula (12).
\(\square \)
Now we show that the linear response of the invariant function is continuous with respect to the kernel perturbation. This will be used in Sect. 4 for the proof of the existence of solutions of our main optimisation problems.
Consider the operator \(L_{0}\), having a kernel \(k_{0}\in L^{2}([0,1]^2)\), and a set of infinitesimal perturbations \(P\subset V_{\ker }\) of \(k_{0}\). We will endow P with the topology induced by its inclusion in \(L^2([0,1]^2)\). Suppose \(L_{\delta }\) is a perturbation of \(L_0\) satisfying the assumptions of Lemma 3.4. By Corollary 3.5, the linear response will depend on the first-order term of the perturbation, \(\dot{k} \in P\), allowing us to define the function \(R:P\rightarrow V\) by
$$\begin{aligned} R({\dot{k}}):=(\mathrm{Id}-L_{0})^{-1}{\dot{L}}f_{0}, \end{aligned}$$(14)
where \({\dot{L}}\) is the integral operator with kernel \({\dot{k}}\).
By (6) and the continuity of the resolvent operator it follows directly that the response function \(R:(P,\Vert \cdot \Vert _{L^2([0,1]^2)})\rightarrow (V,\Vert \cdot \Vert _{L^2})\) is continuous.
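The linear response formula can be checked by a direct numerical experiment. The sketch below is our own illustration (the kernels \(k_0\), \({\dot{k}}\) and the grid are assumptions, not from the paper): it compares the finite-difference response of the invariant density with the resolvent formula \((\mathrm{Id}-L_0)^{-1}{\dot{L}}f_0\), computed on the zero-average subspace V.

```python
import numpy as np

# Compare finite differences of invariant densities with the linear response
# formula of Corollary 3.5.  kdot has zero average in x, i.e. kdot is in V_ker.

n = 200
x = (np.arange(n) + 0.5) / n
h = 1.0 / n

k0   = 1.0 + 0.5 * np.cos(2*np.pi*x)[:, None] * np.cos(2*np.pi*x)[None, :]
kdot = np.cos(2*np.pi*x)[:, None] * np.ones(n)[None, :]
assert np.allclose(kdot.sum(axis=0) * h, 0.0)          # kdot in V_ker

def invariant(K):
    f = np.ones(n)
    for _ in range(200):
        f = (K @ f) * h
        f /= f.sum() * h
    return f

M  = k0 * h                                  # discretised L_0
f0 = invariant(k0)
b  = (kdot @ f0) * h                         # Ldot f_0, an element of V

# Solve (Id - L_0) u = b, picking the representative with zero average.
u, *_ = np.linalg.lstsq(np.eye(n) - M, b, rcond=None)
u -= (u.sum() * h) * f0

delta = 1e-4
fd = (invariant(k0 + delta * kdot) - f0) / delta       # finite differences
assert np.linalg.norm(fd - u) * np.sqrt(h) < 1e-3      # the responses agree
```

For this example both quantities equal \(\tfrac{4}{3}\cos (2\pi x)\) up to discretisation error, so the agreement is essentially exact.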
3.3 A Formula for the Linear Response of the Dominant Eigenvalues and Its Continuity
We apply Proposition 2.6 to Hilbert–Schmidt integral operators and obtain a linear response formula for the dominant eigenvalues in the case of \(L^2\) perturbations. Denote by \(\Re (\cdot )\) and \(\Im (\cdot )\) the functions that return the real and imaginary parts of complex arguments.
Corollary 3.6
Suppose \(L_{\delta }:L^2([0,1],{{\mathbb {C}}})\rightarrow L^2([0,1],{{\mathbb {C}}})\) are integral-preserving (satisfying (3)) integral operators with kernels \(k_{\delta }\in L^2([0,1]^{2})\) satisfying \(\delta \mapsto k_\delta \in C^1([0,{\bar{\delta }}),L^2([0,1]^{2}))\). Suppose \(L_0\) satisfies (A1) of Theorem 2.2. Let \(\lambda _0\in {{\mathbb {C}}}\) be an eigenvalue of \(L_0\) with the largest magnitude strictly inside the unit circle and assume that \(\lambda _0\) is geometrically simple. Then, there exists \({\dot{\lambda }}\in {\mathbb {C}}\) such that
$$\begin{aligned} \lambda _{\delta }=\lambda _{0}+\delta \cdot {\dot{\lambda }}+o(\delta ). \end{aligned}$$
Furthermore,
$$\begin{aligned} {\dot{\lambda }}=\langle {\hat{e}},{\dot{L}}e\rangle _{L^{2}([0,1],{\mathbb {C}})}, \end{aligned}$$(15)
where \(e\in L^2([0,1],{\mathbb {C}})\) is the eigenvector of \(L_0\) associated to the eigenvalue \(\lambda _0\), \({\hat{e}}\in L^2([0,1],{\mathbb {C}})\) is the eigenvector of \(L_0^*\) associated to the eigenvalue \(\overline{\lambda _{0}}\) and \({\dot{L}}\) is the operator in Lemma 3.4.
Proof
Since \(k_\delta \in L^2([0,1]^2)\), the operator \(L_\delta :L^2([0,1],{{\mathbb {C}}})\rightarrow L^2([0,1],{{\mathbb {C}}})\) is compact; by assumption, it also satisfies (3). From Lemma 3.4, the map \(\delta \mapsto L_\delta \) is \(C^1\). Hence, by Proposition 2.6, we have \({\dot{\lambda }} = \langle {\hat{e}},{\dot{L}} e \rangle _{L^2([0,1],{\mathbb {C}})}\). Finally, writing \({\dot{L}}\) in terms of its kernel \({\dot{k}}\) (Lemma 3.4) yields the expression in the statement.
\(\square \)
From the expression in the final line of the proof above, it is clear that if we consider \({\dot{\lambda }}\) as a function of \({\dot{k}}\), the map \({\dot{\lambda }}:(V_{\ker },\Vert \cdot \Vert _{L^2([0,1]^2)})\rightarrow {\mathbb {C}}\) is continuous.
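The eigenvalue response formula can likewise be tested numerically. The sketch below is our own illustration (the kernels are assumptions; we normalise \(\langle {\hat{e}},e\rangle =1\), as in the statement quoted from Proposition 2.6): it compares \(\langle {\hat{e}},{\dot{L}}e\rangle \) with a finite-difference derivative of the second-largest eigenvalue.

```python
import numpy as np

# Check lambda_dot = <e_hat, Ldot e> for the simple eigenvalue lambda_0 = 1/4
# of a discretised kernel operator, against finite differences.

n = 200
x = (np.arange(n) + 0.5) / n
h = 1.0 / n

k0   = 1.0 + 0.5 * np.cos(2*np.pi*x)[:, None] * np.cos(2*np.pi*x)[None, :]
kdot = np.cos(2*np.pi*x)[:, None] * np.cos(2*np.pi*x)[None, :]   # in V_ker

M = k0 * h                                   # discretised L_0
vals, vecs = np.linalg.eig(M)
i0 = np.argsort(-np.abs(vals))[1]            # second-largest magnitude
lam0 = vals[i0].real                         # = 0.25 for this kernel
e = vecs[:, i0].real                         # right eigenvector

wals, wecs = np.linalg.eig(M.T)              # left eigenvectors of M
ehat = wecs[:, np.argmin(np.abs(wals - vals[i0]))].real
ehat = ehat / ((ehat @ e) * h)               # normalise <e_hat, e>_{L^2} = 1

lam_dot = (ehat @ ((kdot * h) @ e)) * h      # <e_hat, Ldot e>

delta = 1e-5                                 # finite-difference comparison
vals_d = np.linalg.eigvals((k0 + delta * kdot) * h)
lam_d = vals_d[np.argsort(-np.abs(vals_d))[1]].real
assert abs((lam_d - lam0) / delta - lam_dot) < 1e-4
```

Here the perturbed eigenvalue is \(\lambda _\delta =1/4+\delta /2\), so both computations return \({\dot{\lambda }}=1/2\).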
4 Optimal Response: Optimising the Expectation of Observables and Mixing Rate
Having described the responses of our dynamical systems to perturbations, it is natural to consider the optimisation problem of finding perturbations that provoke maximal responses. We consider the problems of finding the infinitesimal perturbation that maximises the expectation of a given observable and the infinitesimal perturbation that maximally enhances mixing. In doing so, we extend the approach in Antown et al. (2018) from the setting of finite-state Markov chains to the integral operators considered in the present paper. We are now in the realm of infinite-dimensional optimisation, which is considerably more challenging than the finite-dimensional optimisation in Antown et al. (2018).
We show that at an abstract level these problems reduce to the optimisation of a linear continuous functional \({\mathcal {J}}\) on a convex set P of feasible perturbations; this problem has a solution and the solution is unique if the set P of allowed infinitesimal perturbations is strictly convex. The convexity assumption on P is natural because if two different perturbations of the system are possible, then their convex combination (applying the two perturbations with different intensities) will also be possible. After introducing the abstract setting, we construct the objective functions for our two optimal response problems and state general existence and uniqueness results for the optima. Later, in Sect. 5 we focus on the construction of the set of feasible perturbations and provide explicit formulae for the maximising perturbations.
4.1 General Optimisation Setting, Existence and Uniqueness
We recall some general results (adapted for our purposes) on optimising a linear continuous function on convex sets; see also Lemma 6.2 (Froyland et al. 2020). The abstract problem is to find \({\dot{k}}\) such that
$$\begin{aligned} {\mathcal {J}}({\dot{k}})=\max _{k\in P}{\mathcal {J}}(k), \end{aligned}$$(16)
where \({\mathcal {J}}:{\mathcal {H}}\rightarrow {{\mathbb {R}}}\) is a continuous linear function, \({\mathcal {H}}\) is a separable Hilbert space and \(P\subset {\mathcal {H}}\).
Proposition 4.1
(Existence of the optimal solution) Let P be bounded, convex, and closed in \({\mathcal {H}}\). Then, the problem (16) has at least one solution.
Proof
Since P is bounded and \({\mathcal {J}}\) is continuous, we have that \(\sup _{k\in P}{\mathcal {J}}(k)<\infty \). Consider a maximising sequence \(k_{n}\) such that \( \lim _{n\rightarrow \infty }{\mathcal {J}}(k_{n})=\sup _{k\in P}{\mathcal {J}}(k)\). Then, \(k_{n}\) has a subsequence \(k_{n_{j}}\) converging in the weak topology. Since P is strongly closed and convex in \({\mathcal {H}}\), we have that it is weakly closed. This implies that \(\overline{k}:=\lim _{j\rightarrow \infty }k_{n_{j}}\in P.\) Also, since \({\mathcal {J}}(k)\) is continuous and linear, it is continuous in the weak topology. Then we have that \({\mathcal {J}}(\overline{k})=\lim _{j\rightarrow \infty }{\mathcal {J}}(k_{n_{j}})=\sup _{k\in P}{\mathcal {J}}(k)\) and we realise a maximum. \(\square \)
Uniqueness of the optimal solution will be provided by strict convexity of the feasible set.
Definition 4.2
We say that a convex closed set \(A\subseteq {\mathcal {H}}\) is strictly convex if for each pair \(x,y\in A\) with \(x\ne y\) and all \(0<\gamma <1\), the point \(\gamma x+(1-\gamma )y\in \mathrm {int}(A)\), where the relative interior is meant.
Proposition 4.3
(Uniqueness of the optimal solution) Suppose P is a closed, bounded, and strictly convex subset of \({{\mathcal {H}}}\), and that P contains the zero vector in its relative interior. If \({\mathcal {J}}\) does not uniformly vanish on P, then the optimal solution to (16) is unique.
Proof
Suppose that there are two distinct maxima \({\dot{k}}_1,{\dot{k}}_2\in P\) with \({\mathcal {J}}({\dot{k}}_1)={\mathcal {J}}({\dot{k}}_2)=\alpha \). Let \(0<\gamma <1\) and set \(z=\gamma {\dot{k}}_1+(1-\gamma ){\dot{k}}_2\). By strict convexity of P, \(z\in \mathrm {int}(P)\), and by linearity of \({\mathcal {J}}\), \({\mathcal {J}}(z)=\alpha \). Let \(B_r(z)\) denote a (relative in P) open ball of radius r centred at z, with \(r>0\) chosen small enough so that \(B_r(z)\subset \mathrm {int}(P)\). Because the zero vector lies in the relative interior of P, and \({\mathcal {J}}\) does not uniformly vanish on P, there exists a vector \(v\in B_r(z)\) such that \({\mathcal {J}}(v)>0\). Now \(z+\frac{rv}{2\Vert v\Vert }\in \mathrm {int}(P)\) and \({\mathcal {J}}(z+\frac{rv}{2\Vert v\Vert })>\alpha \), contradicting maximality of \({\dot{k}}_1\). \(\square \)
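A minimal finite-dimensional illustration of these two propositions (our own, not from the paper): over the closed unit ball, which is bounded, closed and strictly convex, the continuous linear functional \({\mathcal {J}}(k)=\langle k,E\rangle \) attains its maximum at the unique point \(k=E/\Vert E\Vert \), by Cauchy–Schwarz.

```python
import numpy as np

# Maximise J(k) = <k, E> over the closed unit ball of R^50; the unique
# maximiser is k_star = E / ||E||, and random feasible points never beat it.

rng = np.random.default_rng(0)
E = rng.normal(size=50)                      # defines J(k) = E @ k
k_star = E / np.linalg.norm(E)               # claimed unique maximiser

for _ in range(1000):                        # random feasible competitors
    k = rng.normal(size=50)
    k /= max(1.0, np.linalg.norm(k))         # force k into the unit ball
    assert E @ k <= E @ k_star + 1e-12
```

The optimal value is \({\mathcal {J}}(k^{*})=\Vert E\Vert \); strict convexity of the ball is what rules out a second maximiser.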
In the following subsections we apply the general results of this section to our specific optimisation problems.
4.2 Optimising the Response of the Expectation of an Observable
Let \(c\in L^2\) be a given observable. We consider the problem of finding an infinitesimal perturbation that maximises the expectation of c. The perturbations we consider are perturbations to the kernels of Hilbert–Schmidt integral operators, of the form (9). If we denote the average of c with respect to the perturbed invariant density \(f_{\delta }\) by
$$\begin{aligned} {\mathbb {E}}_{\delta }(c):=\int c\,f_{\delta }\,\mathrm{d}m, \end{aligned}$$
we have
$$\begin{aligned} \lim _{\delta \rightarrow 0}\frac{{\mathbb {E}}_{\delta }(c)-{\mathbb {E}}_{0}(c)}{\delta }=\Big \langle c,\lim _{\delta \rightarrow 0}\frac{f_{\delta }-f_{0}}{\delta }\Big \rangle =\langle c,R({\dot{k}})\rangle , \end{aligned}$$
where the last equality follows from Corollary 3.5 and (14).
The function \({\mathcal {J}}({\dot{k}})=\langle c,R({\dot{k}})\rangle \) is clearly continuous as a map from \((V_{\ker },\Vert \cdot \Vert _{L^2([0,1]^2)})\) to \({\mathbb {R}}\). Suppose that P is a closed, bounded, convex subset of \(V_{\ker }\) containing the zero perturbation, and that \({\mathcal {J}}\) is not uniformly vanishing on P. We wish to solve the following problem:
General Problem 1
Find \({\dot{k}}\in P\) such that
$$\begin{aligned} \langle c,R({\dot{k}})\rangle =\max _{k\in P}\langle c,R(k)\rangle . \end{aligned}$$(17)
We may immediately apply Proposition 4.1 to obtain that there exists a solution to (17). If, in addition, P is strictly convex, then by Proposition 4.3 the solution to (17) is unique.
To end this subsection we note that without loss of generality, we may assume that \(c\in \) span\(\{f_0\}^\perp \). This is because for \(c\in L^{2}\), we have
$$\begin{aligned} \langle c,R({\dot{k}})\rangle =\big \langle c-\langle c,f_0\rangle _{L^{2}([0,1],{{\mathbb {R}}})}{\mathbf {1}},R({\dot{k}})\big \rangle , \end{aligned}$$
since \(R({\dot{k}})\in V\). From \(\int f_0(x) \mathrm{d}x = 1,\) we have that \(f\mapsto \langle f,f_0\rangle _{L^{2}([0,1],{{\mathbb {R}}})}{\mathbf {1}}\) is a projection onto span\(\{{\mathbf {1}}\}\) and so \(f\mapsto f- \langle f,f_0\rangle _{L^{2}([0,1],{{\mathbb {R}}})}{\mathbf {1}}\) is a projection onto span\( \{f_0\}^\perp \).
4.3 Optimising the Response of the Rate of Mixing
We now consider the linear response problem of optimising the rate of mixing. Let \(\lambda _0\in {{\mathbb {C}}}\) denote an eigenvalue of \(L_0\) strictly inside the unit circle with largest magnitude. From now on, whenever discussing the linear response of eigenvalues to kernel perturbations we assume the conditions of Corollary 3.6. We recall that e and \({\hat{e}}\) are the eigenfunctions of \(L_0\) and \(L_0^*\), respectively, corresponding to the eigenvalue \(\lambda _0\).
To find the kernel perturbations that enhance mixing, we follow the general approach taken in Antown et al. (2018) (see also Froyland and Santitissadeekorn 2017; Froyland et al. 2020 in the continuous time setting), namely perturbing our original dynamics \(L_0\) in such a way that the modulus of the second eigenvalue of the perturbed dynamics decreases. Equivalently, we want to decrease the real part of the logarithm of the perturbed second eigenvalue. The following result provides an explicit formula for this instantaneous rate of change. Define
Lemma 4.4
One has
Proof
From (15), we have that
and
Next, we note that
\(\square \)
The function \({\mathcal {J}}({\dot{k}})=\langle {\dot{k}},E\rangle \) is clearly continuous as a map from \((V_{\ker },\Vert \cdot \Vert _{L^2([0,1]^2)})\) to \({\mathbb {R}}\). As in Sect. 4.2, suppose that P is a closed, bounded, strictly convex subset of \(V_{\ker }\) containing the zero element, and that \({\mathcal {J}}\) is not uniformly vanishing on P. We wish to solve the following problem:
General Problem 2
Find \({\dot{k}}\in P\) such that
$$\begin{aligned} \langle {\dot{k}},E\rangle =\max _{k\in P}\langle k,E\rangle . \end{aligned}$$(22)
We may immediately apply Proposition 4.1 to obtain that there exists a solution to (22). If, in addition, P is strictly convex, then by Proposition 4.3 the solution to (22) is unique.
5 Explicit Formulae for the Optimal Perturbations
Thus far we have not been specific about the feasible set P; we take up this issue in this and the succeeding subsections to provide explicit formulae for the optimal responses in both problems (17) and (22). First, we have not required that the perturbed kernel \(k_\delta \) in (9) be nonnegative for \(\delta >0\); however, this is a natural assumption. To facilitate this, for \(0<l<1\), define
$$\begin{aligned} S_{k_{0},l}:=\big \{k\in L^{2}([0,1]^{2}):k=0\hbox { a.e. on }\{(x,y)\in [0,1]^{2}:k_{0}(x,y)<l\}\big \}. \end{aligned}$$
The set of allowable perturbations that we will consider in the sequel is
$$\begin{aligned} P_{l}:=B_{1}\cap V_{\ker }\cap S_{k_{0},l}, \end{aligned}$$
where \(B_1\) is the closed unit ball in \(L^2([0,1]^2)\). For modelling purposes, one may use also the parameter l to restrict the class of allowed perturbations to those that are more likely to occur according to \(k_0\). Note that in the particular situation where the support of \(k_0\) is sparse—for example when significant determinism is present—this sparsity will be respected by the perturbations in \(P_l\).
We now begin verifying the conditions on \(P_l\) and \({\mathcal {J}}\) required by Proposition 4.3. First, \(P_l\) is clearly bounded in \(L^2([0,1]^2)\). Second, we note that as long as \(F_l\) has positive Lebesgue measure, the zero kernel is in the relative interior of \(P_l\). Third, the following lemma handles closedness of \(S_{k_0,l}\), and hence of \(P_l\). Fourth, since \(V_{\ker }\) and \(S_{k_0,l}\) are closed subspaces, \( V_{\ker }\cap S_{k_0,l}\) is itself a Hilbert space; hence \(P_l\), the closed unit ball of this Hilbert space, is strictly convex. Finally, sufficient conditions for the objective function to not uniformly vanish are given in Lemma 5.2.
Lemma 5.1
The set \(S_{k_0,l}\) is a closed subspace of \(L^2([0,1]^2)\).
Proof
The fact that \(S_{k_0,l}\) is a subspace is trivial. Let \(\{k_n\}\subset S_{k_0,l}\) and suppose \(k_n\rightarrow _{L^2} k\in L^2([0,1]^2)\). Further suppose \(\{(x,y)\in [0,1]^2: k_0(x,y)< l\}\) is not a null set; otherwise \(S_{k_0,l}=L^2([0,1]^2)\) and the result immediately follows. Then, since \(k_{n}=0\) a.e. on \(\{k_{0}<l\}\), we have
$$\begin{aligned} \Vert k_{n}-k\Vert _{L^{2}([0,1]^{2})}^{2}=\int _{\{k_{0}\ge l\}}(k_{n}(x,y)-k(x,y))^{2}\,\mathrm{d}y\mathrm{d}x+\int _{\{k_{0}<l\}}k(x,y)^{2}\,\mathrm{d}x\mathrm{d}y\rightarrow 0. \end{aligned}$$
Since \(\int _{\{k_0\ge l\}} (k_n(x,y)-k(x,y))^2\mathrm{d}y\mathrm{d}x\ge 0\), if \(\int _{\{k_0< l\}} k(x,y)^2 \mathrm{d}x\mathrm{d}y>0\) then we obtain a contradiction; thus, \( \int _{\{k_0< l\}} k(x,y)^2 \mathrm{d}x\mathrm{d}y=0\) and therefore \(k=0\) a.e. on \(\{(x,y)\in [0,1]^2: k_0(x,y)< l\}\). Hence, \(S_{k_0,l}\) is closed. \(\square \)
Let
$$\begin{aligned} F_{l}:=\{(x,y)\in [0,1]^{2}:k_{0}(x,y)\ge l\}, \end{aligned}$$
and for \(F_l\subset [0,1]^2\), define
$$\begin{aligned} F_{l}^{y}:=\{x\in [0,1]:(x,y)\in F_{l}\}\quad \hbox {and}\quad \Xi (F_{l}):=\{y\in [0,1]:m(F_{l}^{y})>0\}. \end{aligned}$$
The following lemma provides sufficient conditions for a functional of the general form we wish to optimise to not uniformly vanish. The general objective has the form \({\mathcal {J}}({\dot{k}})=\int \int {\dot{k}}(x,y){\mathcal {E}}(x,y)\ \mathrm{d}y\ \mathrm{d}x\); in our first specific objective (optimising response of expectations) we put \({\mathcal {E}}(x,y)=((\text {Id}-L_0^*)^{-1}c)(x)\cdot f_0(y)\) and in our second specific objective (optimising mixing) we put \({\mathcal {E}}(x,y)=E(x,y)\) from (18). Let \({\mathcal {E}}^+\) and \({\mathcal {E}}^-\) denote the positive and negative parts of \({\mathcal {E}}\). For \(y\in \Xi (F_l)\), let \(A(y)=\int _{F_l^y} {\mathcal {E}}^+(x,y)\ \mathrm{d}x\) and \(a(y)=\int _{F_l^y} {\mathcal {E}}^-(x,y)\ \mathrm{d}x\).
Lemma 5.2
Assume that there is \(\Xi '\subset \Xi (F_l)\) such that \(m(\Xi ')>0\) and \(A(y),a(y)>0\) for \(y\in \Xi '\). Then there is a \({\dot{k}}\in P_l\) such that \({\mathcal {J}}({\dot{k}})>0\).
Proof
For \(y\in \Xi (F_l)\), set \({\dot{k}}(x,y)={\mathbf {1}}_{F_l^y}(x)\left( a(y){\mathcal {E}}^+(x,y)-A(y){\mathcal {E}}^-(x,y)\right) \). To show \({\dot{k}}\in P_l\) we need to check that (i) the support of \({\dot{k}}\) is contained in \(F_l\) and (ii) \(\int _{F_l^y} {\dot{k}}(x,y)\ \mathrm{d}x=0\) for a.e. \(y\in \Xi (F_l)\); these points show \({\dot{k}}\in S_{k_0,l}\cap V_{\ker }\) and by trivial scaling we may obtain \({\dot{k}}\in B_1\). Item (i) is obvious from the definition of \({\dot{k}}\). For item (ii) we compute
$$\begin{aligned} \int _{F_{l}^{y}}{\dot{k}}(x,y)\,\mathrm{d}x=a(y)\int _{F_{l}^{y}}{\mathcal {E}}^{+}(x,y)\,\mathrm{d}x-A(y)\int _{F_{l}^{y}}{\mathcal {E}}^{-}(x,y)\,\mathrm{d}x=a(y)A(y)-A(y)a(y)=0. \end{aligned}$$
Finally, we check that \({\mathcal {J}}({\dot{k}})>0\). Using \({\mathcal {E}}={\mathcal {E}}^{+}-{\mathcal {E}}^{-}\) and \({\mathcal {E}}^{+}{\mathcal {E}}^{-}=0\), one has
$$\begin{aligned} {\mathcal {J}}({\dot{k}})=\int _{\Xi (F_{l})}\Big (a(y)\int _{F_{l}^{y}}({\mathcal {E}}^{+}(x,y))^{2}\,\mathrm{d}x+A(y)\int _{F_{l}^{y}}({\mathcal {E}}^{-}(x,y))^{2}\,\mathrm{d}x\Big )\,\mathrm{d}y. \end{aligned}$$
This final expression is positive by the hypotheses of the lemma. \(\square \)
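The witness constructed in this proof is easy to check numerically. In the sketch below (our own; the objective kernel \({\mathcal {E}}\) and region \(F_l\) are hypothetical assumptions) the fibrewise x-average of \({\dot{k}}\) vanishes and the objective value is strictly positive.

```python
import numpy as np

# Build the Lemma 5.2 witness kdot(x,y) = 1_{F_l^y}(x)(a(y)E^+ - A(y)E^-)
# for a hypothetical objective kernel E and region F, and verify (ii) and
# J(kdot) > 0.  (Scaling kdot into the unit ball B_1 is a trivial rescaling.)

n = 100
x = (np.arange(n) + 0.5) / n
h = 1.0 / n

E = np.sin(2*np.pi*x)[:, None] * (1.0 + 0.5*np.cos(2*np.pi*x))[None, :]
F = np.abs(x[:, None] - x[None, :]) < 0.3          # hypothetical region F_l

Ep, Em = np.maximum(E, 0.0), np.maximum(-E, 0.0)   # positive/negative parts
A = (F * Ep).sum(axis=0) * h                       # A(y) = int_{F_l^y} E^+ dx
a = (F * Em).sum(axis=0) * h                       # a(y) = int_{F_l^y} E^- dx

kdot = F * (a[None, :] * Ep - A[None, :] * Em)

# (ii): zero average in x on every fibre (equals a(y)A(y) - A(y)a(y) = 0).
assert np.allclose(kdot.sum(axis=0) * h, 0.0)
# J(kdot) = int int kdot * E dx dy > 0, as the lemma asserts.
J = (kdot * E).sum() * h * h
assert J > 0
```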
Remark 5.3
We note that in the situation where \({\mathcal {E}}(x,y)\) is in separable form \({\mathcal {E}}(x,y)=h_1(x)h_2(y)\)—as in the case of optimising the derivative of the expectation of an observable c , and in the case of optimising the derivative of a real eigenvalue—then \(A(y)=h_2(y)\int _{F_l^y} h_1^+(x)\ \mathrm{d}x\) and \(a(y)=h_2(y)\int _{F_l^y} h_1^-(x)\ \mathrm{d}x\). Because \(h_2=f_0\) and \(h_2=e\) are not the zero function, and \(h_1=(\text {Id}-L_0^*)^{-1}c\) and \(h_1={\hat{e}}\) are both nontrivial signed functions, the conditions of Lemma 5.2 are relatively easy to satisfy.
5.1 Maximising the Expectation of an Observable
In this section, we provide an explicit formula for the optimal kernel perturbation to increase the expectation of an observation function c by the greatest amount. Since the objective function in (17) is linear in \({\dot{k}}\), a maximum will occur on \(\partial B_1\cap V_{\ker }\cap S_{k_0,l}\) (i.e. we only need to consider the optimisation over the unit sphere and not the unit ball). Thus, we consider the following reformulation of General Problem 1:
Problem A
Given \(l > 0 \) and \(c\in \) span\(\{f_0\}^\perp \), solve
$$\begin{aligned} \max _{{\dot{k}}\in \partial B_{1}\cap V_{\ker }\cap S_{k_{0},l}}\langle c,R({\dot{k}})\rangle . \end{aligned}$$
Our first main result is:
Theorem 5.4
Let \(L_0:L^2\rightarrow L^2\) be an integral operator with the stochastic kernel \(k_0\in L^2([0,1]^2)\). Suppose that \(L_0\) satisfies (A1) of Theorem 2.2 and that there is a \(\Xi '\subset \Xi (F_l)\) with \(m(\Xi ')>0\) and \(f_0(y)>0, \int _{F_l^y} ((\text {Id}-L_0^*)^{-1}c)^+(x)\ \mathrm{d}x>0\), and \(\int _{F_l^y} ((\text {Id}-L_0^*)^{-1}c)^-(x)\ \mathrm{d}x>0\) for \(y\in \Xi '\). Then the unique solution to Problem A is
$$\begin{aligned} {\dot{k}}(x,y)=\alpha \,{\mathbf {1}}_{F_{l}}(x,y)\,f_{0}(y)\bigg ( ((\text {Id}-L_0^*)^{-1}c)(x)-\frac{1}{m(F_{l}^{y})}\int _{F_{l}^{y}}((\text {Id}-L_0^*)^{-1}c)(x')\,\mathrm{d}x'\bigg ), \end{aligned}$$(28)
where \(\alpha >0\) is selected so that \(\Vert {\dot{k}} \Vert _{L^2([0,1]^2)}=1\). Furthermore, if \(c\in W:=\) span\(\{f_0\}^\perp \cap L^\infty \), \(k_0\in L^\infty ([0,1]^2)\), and \(k_0\) is such that \( L_0:L^1\rightarrow L^1\) is compact, then \({\dot{k}}\in L^\infty ([0,1]^2)\).
Proof
See Appendix A. \(\square \)
Note that the expression for the optimal perturbation \({\dot{k}}\) in (28) depends only on \(k_0\) and c. This is in part a consequence of the fact that the linear response formula (12) depends only on the first-order term \({\dot{k}}\) (the “direction” of the perturbation) in the expansion of \(k_\delta \). Thus, in order to find the unique perturbation that optimises our linear response, we seek the best “direction” for the perturbation. Similar comments hold for our other three optimal linear perturbation results in later sections.
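Since Problem A maximises the linear functional \({\dot{k}}\mapsto \langle c,R({\dot{k}})\rangle =\langle {\dot{k}},E\rangle \) with \(E(x,y)=((\text {Id}-L_0^*)^{-1}c)(x)f_0(y)\) over the unit sphere of the closed subspace \(V_{\ker }\cap S_{k_0,l}\), the maximiser is the normalised orthogonal projection of E onto that subspace: restrict E to \(F_l\) and subtract its fibrewise average. The sketch below is our own numerical illustration of this (the kernel \(k_0\), observable c, level l and grid are assumptions); it computes the projection and verifies that no random feasible competitor achieves a larger response.

```python
import numpy as np

# Compute the optimal perturbation for Problem A as the normalised projection
# of E(x,y) = ((Id - L0*)^{-1} c)(x) f0(y) onto V_ker intersect S_{k0,l}.

n = 200
x = (np.arange(n) + 0.5) / n
h = 1.0 / n

k0 = 1.0 + 0.5 * np.cos(2*np.pi*x)[:, None] * np.cos(2*np.pi*x)[None, :]
l  = 0.9
F  = k0 >= l                                   # support constraint F_l
M  = k0 * h                                    # discretised L_0 (f_0 = 1 here)
f0 = np.ones(n)
c  = np.cos(2*np.pi*x)                         # observable with <c, f0> = 0
chat, *_ = np.linalg.lstsq(np.eye(n) - M.T, c, rcond=None)  # (Id-L0*)^{-1} c

E = chat[:, None] * f0[None, :]
cnt = np.maximum(F.sum(axis=0), 1)
fib_mean = np.where(F, E, 0.0).sum(axis=0) / cnt
kdot = np.where(F, E - fib_mean[None, :], 0.0)  # project onto V_ker and S_{k0,l}
kdot /= np.linalg.norm(kdot) * h                # unit L^2 norm

assert np.allclose(kdot.sum(axis=0) * h, 0.0)   # feasible: zero x-average
J = (kdot * E).sum() * h * h                    # optimal response value
rng = np.random.default_rng(1)
for _ in range(200):                            # random feasible competitors
    g = np.where(F, rng.normal(size=(n, n)), 0.0)
    g = np.where(F, g - g.sum(axis=0)[None, :] / cnt, 0.0)
    g /= np.linalg.norm(g) * h
    assert (g * E).sum() * h * h <= J + 1e-9
```

Note that \((\text {Id}-L_0^*)^{-1}c\) is only determined up to multiples of \({\mathbf {1}}\) here, but any such multiple is removed by the fibrewise-mean subtraction, so the computed \({\dot{k}}\) is unambiguous.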
Remark 5.5
In certain situations we may desire to make non-infinitesimal perturbations \(k_\delta := k_0 + \delta \cdot {\dot{k}}\) that remain stochastic for small \(\delta >0\). If \({\dot{k}}\in L^\infty ([0,1]^2)\cap V_{\ker }\cap S_{k_0,l}\), clearly \(k_\delta = k_0 + \delta \cdot {\dot{k}}\) satisfies \(\int k_\delta (x,y) \mathrm{d}x =1\) for a.e. y. Also, as we are only perturbing at values where \(k_0\ge l>0 \), and since \({\dot{k}}\) is essentially bounded, there exists a \({\bar{\delta }}>0\) such that \(k_\delta \ge 0\) a.e. for all \( \delta \in (0,{{\bar{\delta }}})\). In summary, for \(\delta \in (0,{\bar{\delta }})\), \(k_\delta \) is a stochastic kernel.
The compactness condition on \(L_0:L^1\rightarrow L^1\) required for essential boundedness of \({\dot{k}}\) can be addressed as follows. A criterion for \(L_0\) to be compact on \(L^1([0,1])\) is the following (see Eveson 1995): Given \( \varepsilon >0\) there exists \(\beta >0\) such that for a.e. \(y\in [0,1]\) and \( \gamma \in {{\mathbb {R}}}\) with \(|\gamma |<\beta \),
$$\begin{aligned} \int _{{\mathbb {R}}}|{\tilde{k}}(x+\gamma ,y)-{\tilde{k}}(x,y)|\,\mathrm{d}x<\varepsilon , \end{aligned}$$
where \({\tilde{k}}:{\mathbb {R}}\times [0,1]\rightarrow {\mathbb {R}}\) is defined by \({\tilde{k}}(x,y)=k_{0}(x,y)\) for \(x\in [0,1]\) and \({\tilde{k}}(x,y)=0\) otherwise.
A class of kernels that satisfy this are essentially bounded kernels \(k_0:[0,1]\times [0,1]\rightarrow {{\mathbb {R}}}\) that are uniformly continuous in the first coordinate. Such a class naturally arises in our dynamical systems settings.
5.2 Maximally Increasing the Mixing Rate
Let \(\lambda _0\in {{\mathbb {C}}}\) denote a geometrically simple eigenvalue of \(L_0\) strictly inside the unit circle and e and \({\hat{e}}\) denote the corresponding eigenvectors of \(L_0\) and \(L_0^*\), respectively. Our results concerning optimal rate of movement of \(\lambda _0\) under system perturbation work for any \(\lambda _0\) as above, but eigenvalues of largest magnitude inside the unit circle have the additional significance of controlling the exponential rate of mixing. We therefore primarily focus on these eigenvalues, and in this section, we consider again the linear response problem for enhancing the rate of mixing, now providing explicit formulae for optimal perturbations and the response.
Since we are again interested in kernel perturbations that will ensure that the perturbed kernel \(k_\delta \) is nonnegative, we consider the constraint set \(P_l\), as introduced at the start of Sect. 5, where \(0<l<1\). The objective function of (22) is linear and therefore, we only need to consider the optimisation problem on \(V_{\ker }\cap S_{k_0,l}\cap \partial B_1\). Thus, to obtain the perturbation \({\dot{k}}\) that will enhance the mixing rate, we solve the following optimisation problem:
Problem B
Given \(l > 0\), solve
$$\begin{aligned} \max _{{\dot{k}}\in \partial B_{1}\cap V_{\ker }\cap S_{k_{0},l}}\langle {\dot{k}},E\rangle , \end{aligned}$$
where E is defined in (18).
Theorem 5.6
Let \(L_0:L^2([0,1],{{\mathbb {C}}})\rightarrow L^2([0,1],{{\mathbb {C}}})\) be an integral operator with the stochastic kernel \(k_0\in L^2([0,1]^2,{{\mathbb {R}}})\). Suppose that \(L_0\) satisfies (A1) of Theorem 2.2 and that there is a \(\Xi '\subset \Xi (F_l)\) with \(m(\Xi ')>0\), and \(\int _{F_l^y} E(x,y)^+\ \mathrm{d}x>0\) and \(\int _{F_l^y} E(x,y)^-\ \mathrm{d}x>0\) for \(y\in \Xi '\). Then, the unique solution to Problem B is
$$\begin{aligned} {\dot{k}}(x,y)=\alpha \,{\mathbf {1}}_{F_{l}}(x,y)\bigg (E(x,y)-\frac{1}{m(F_{l}^{y})}\int _{F_{l}^{y}}E(x',y)\,\mathrm{d}x'\bigg ), \end{aligned}$$
where E is given in (18) and \(\alpha >0\) is selected so that \(\Vert {\dot{k}}\Vert _{L^2([0,1]^2,{{\mathbb {R}}})}=1\). Furthermore, if \(k_0\in L^\infty ([0,1]^2,{{\mathbb {R}}})\) then \({\dot{k}}\in L^\infty ([0,1]^2,{{\mathbb {R}}})\).
Proof
See Appendix B. \(\square \)
If \(\lambda _0\) is real, the optimal kernel has a simpler form:
Corollary 5.7
If \(\lambda _{0}\) is real and \( k_0\ge l\), then the solution to Problem B is
Proof
We have \(E(x,y) = \lambda _0{\hat{e}}(x)e(y)\); thus, the solution to the optimisation problem (29)–(30) is
$$\begin{aligned} {\dot{k}}(x,y)=\alpha \,\lambda _{0}\,e(y)\bigg ({\hat{e}}(x)-\int _{0}^{1}{\hat{e}}(x')\,\mathrm{d}x'\bigg ), \end{aligned}$$
where \(\alpha >0\) is the normalisation constant such that \(\Vert {\dot{k}} \Vert _{L^2([0,1]^2,{{\mathbb {R}}})}^2=1\). \(\square \)
6 Linear Response for Map Perturbations
In this section, we consider random dynamics governed by the composition of a deterministic map \( T_{\delta }\), \(\delta \in [0,{\bar{\delta }})\), and additive i.i.d. stochastic perturbations, or “additive noise”. We will assume that the noise is distributed according to a certain Lipschitz probability density \(\rho \) and impose a reflecting boundary condition that ensures that the dynamics remain in the interval [0, 1]. More precisely, we consider a random dynamical system whose trajectories are given by
$$\begin{aligned} x_{n+1}=T_{\delta }(x_{n})\,\hat{+}\,\omega _{n}, \end{aligned}$$(33)
where \(\hat{+}\) is the “boundary reflecting” sum, defined by \(a\hat{+}b:=\pi (a+b)\), and \(\pi :{\mathbb {R}}\rightarrow [0,1]\) is the piecewise linear map \(\pi (x)=\min _{i\in {\mathbb {Z}}}|x-2i|\). We assume throughout that
-
(T1)
\(T_{\delta }:[0,1]\rightarrow [0,1]\) is a Borel-measurable map for each \(\delta \in [0,{{\bar{\delta }}})\),
-
(T2)
\(\omega _{n}\) is an i.i.d. process distributed according to a probability density \(\rho \in Lip({\mathbb {R}})\), supported on \([-1,1]\) with Lipschitz constant K.
6.1 Expressing the Map Perturbation as a Kernel Perturbation
In this subsection we describe precisely the kernel of the transfer operator of the system (33). Associated with the process (33) is an integral-type transfer operator \(L_\delta \), which we will derive (following the method of §10.5 in Lasota and Mackey 1985). Noting that \(|\pi '(z)|=1\) for all \(z\in {\mathbb {R}}\), the Perron-Frobenius operator \(P_\pi :L^1({\mathbb {R}})\rightarrow L^1([0,1])\) associated to the map \(\pi \) is given by
$$\begin{aligned} (P_{\pi }g)(x)=\sum _{z\in \pi ^{-1}(x)}g(z)=\sum _{i\in {\mathbb {Z}}}\big [g(2i+x)+g(2i-x)\big ]. \end{aligned}$$
For \(b\in {{\mathbb {R}}}\) consider the shift operator \(\tau _b\) defined by \((\tau _b g)(y):=g(y+b)\) for \(g\in Lip({\mathbb {R}})\). For the process (33), suppose that \(x_n\) has the distribution \(f_n:[0,1]\rightarrow {\mathbb {R}}^+\) (i.e. \(f_n\in L^1,\ f_n\ge 0\) and \(\int f_n\ \mathrm{d}m =1\)). We note that \(T_\delta (x_n)\) and \(\omega _{n}\) are independent and thus the joint density of \((x_n,\omega _n)\in [0,1]\times [-1,1]\) is \(f_n\cdot \rho \). Let \(h:[0,1]\rightarrow {{\mathbb {R}}}\) be a bounded, measurable function and let \({\mathbb {E}}\) denote expectation with respect to Lebesgue measure; we then compute
$$\begin{aligned} {\mathbb {E}}(h(x_{n+1}))&={\mathbb {E}}\big (h(T_{\delta }(x_{n})\,\hat{+}\,\omega _{n})\big )=\int _{0}^{1}\int _{-1}^{1}h\big (\pi (T_{\delta }(y)+\omega )\big )\rho (\omega )f_{n}(y)\,\mathrm{d}\omega \,\mathrm{d}y\\&=\int _{0}^{1}\int _{{\mathbb {R}}}h(\pi (z))(\tau _{-T_{\delta }(y)}\rho )(z)f_{n}(y)\,\mathrm{d}z\,\mathrm{d}y=\int _{0}^{1}h(x)\int _{0}^{1}P_{\pi }(\tau _{-T_{\delta }(y)}\rho )(x)f_{n}(y)\,\mathrm{d}y\,\mathrm{d}x, \end{aligned}$$
where the last equality follows from the duality of the Perron-Frobenius and the Koopman operators for \(\pi \). Since \({\mathbb {E}}(h(x_{n+1})) = \int _{0}^1 h(x) f_{n+1}(x)\mathrm{d}x\), and h is arbitrary, the map \(f_n\mapsto f_{n+1}\) is given by
$$\begin{aligned} f_{n+1}(z')=\int _{0}^{1}P_{\pi }(\tau _{-T_{\delta }(y)}\rho )(z')f_{n}(y)\,\mathrm{d}y \end{aligned}$$
for all \(z'\in [0,1]\). Thus, for \(\delta \in [0,{\bar{\delta }})\) the integral operator \(L_\delta :L^2([0,1])\rightarrow L^2([0,1])\) associated to the process (33) is given by
$$\begin{aligned} (L_{\delta }f)(x)=\int _{0}^{1}k_{\delta }(x,y)f(y)\,\mathrm{d}y, \end{aligned}$$(35)
where
$$\begin{aligned} k_{\delta }(x,y)=P_{\pi }(\tau _{-T_{\delta }(y)}\rho )(x) \end{aligned}$$(36)
and \(x,y\in [0,1]\).
Lemma 6.1
The kernel (36) is a stochastic kernel in \(L^\infty ([0,1]^2)\).
Proof
Stochasticity and nonnegativity of \(k_\delta \) follow from stochasticity and nonnegativity of \(\rho \) and the fact that Perron-Frobenius operators preserve these properties. Essential boundedness of \(k_\delta \) follows from the facts that \(\rho \) is Lipschitz (thus essentially bounded), \(\tau \) is a shift, and \(P_\pi \) is constructed from a finite sum because \(\rho \) has compact support. \(\square \)
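The kernel construction can be sketched numerically. Since \(|\pi '|=1\), the preimages of \(x\in [0,1]\) under \(\pi \) are \(\{2i+x,\,2i-x:i\in {\mathbb {Z}}\}\), and only finitely many terms contribute when \(\rho \) has compact support. In the code below (our own illustration; the map T and the tent-shaped density \(\rho \) of amplitude 0.2 are hypothetical assumptions) we assemble the kernel and confirm the stochasticity of Lemma 6.1.

```python
import numpy as np

# Build the reflecting-boundary kernel k(x,y) = P_pi(tau_{-T(y)} rho)(x)
#   = sum_i [ rho(2i + x - T(y)) + rho(2i - x - T(y)) ],
# where only i in {-1, 0, 1} contribute for noise supported in [-eps, eps].

eps = 0.2
rho = lambda u: np.maximum(1.0 - np.abs(u) / eps, 0.0) / eps   # tent density
T   = lambda y: (y + 0.3 * np.sin(2*np.pi*y)) % 1.0            # hypothetical map

n = 500
x = (np.arange(n) + 0.5) / n
h = 1.0 / n
X, Y = np.meshgrid(x, x, indexing="ij")        # X: output variable, Y: input

K = sum(rho(2*i + X - T(Y)) + rho(2*i - X - T(Y)) for i in (-1, 0, 1))

# Lemma 6.1: k_delta is a nonnegative stochastic kernel.
assert np.all(K >= 0)
assert np.allclose(K.sum(axis=0) * h, 1.0, atol=1e-2)   # columns integrate to 1
```

Near the endpoints the reflected terms fold the noise mass back into [0, 1], which is exactly why every column still integrates to 1.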
Proposition 6.2
Assume that \(k_{\delta }\) arising from the system \((T_{\delta },\rho )\) is given by (36). Suppose that the family of interval maps \(\{T_\delta \}_{\delta \in [0,{{\bar{\delta }}})}\) satisfies
$$\begin{aligned} T_{\delta }=T_{0}+\delta \cdot {\dot{T}}+t_{\delta }, \end{aligned}$$
where \({\dot{T}},t_\delta \in L^{2}\) and \(\Vert t_\delta \Vert _{2} =o(\delta )\). Then
$$\begin{aligned} k_{\delta }=k_{0}+\delta \cdot {\dot{k}}+r_{\delta }, \end{aligned}$$
where \({\dot{k}}\in L^2([0,1]^2)\) is given by
$$\begin{aligned} {\dot{k}}(x,y)=-{\dot{T}}(y)\,P_{\pi }\big (\tau _{-T_{0}(y)}\rho '\big )(x), \end{aligned}$$(37)
with \(\rho '\) the a.e.-defined derivative of the Lipschitz density \(\rho \),
and \(r_\delta \in L^{2}([0,1]^{2})\) satisfies \(\Vert r_\delta \Vert _{L^2([0,1]^2)}=o(\delta )\).
If additionally, \(\mathrm{d}\rho /\mathrm{d}x\) is Lipschitz and the derivative of the map \(\delta \mapsto T_\delta \) with respect to \(\delta \) varies continuously in \(L^2\) in a neighborhood of \(\delta =0\), then \(\delta \mapsto k_\delta \) has a continuous derivative with respect to \(\delta \) in a neighborhood of \(\delta =0\).
Proof
We show that \(\Vert k_{\delta }(x,y)-k_{0}(x,y) - \delta \cdot {\dot{k}}(x,y)\Vert _{L^2([0,1]^2)}=o(\delta )\), where \({\dot{k}}\) is as in (37). We have
$$\begin{aligned} \Vert k_{\delta }-k_{0}-\delta \cdot {\dot{k}}\Vert _{L^2([0,1]^2)}&\le \big \Vert P_{\pi }(\tau _{-T_{\delta }(y)}\rho )-P_{\pi }(\tau _{-(T_{0}(y)+\delta \cdot {\dot{T}}(y))}\rho )\big \Vert _{L^2([0,1]^2)}\\&\quad +\big \Vert P_{\pi }(\tau _{-(T_{0}(y)+\delta \cdot {\dot{T}}(y))}\rho )-P_{\pi }(\tau _{-T_{0}(y)}\rho )-\delta \cdot {\dot{k}}\big \Vert _{L^2([0,1]^2)}. \end{aligned}$$(38)
We begin by showing that the first term on the right hand side of (38) is \(o(\delta )\). Since \(\rho \) is Lipschitz with constant K, one has
$$\begin{aligned} \big |(\tau _{-T_{\delta }(y)}\rho )(x)-(\tau _{-(T_{0}(y)+\delta \cdot {\dot{T}}(y))}\rho )(x)\big |\le K|t_{\delta }(y)|. \end{aligned}$$(39)
Because the support of \(\tau _{-(T_\delta (y))}\rho -\tau _{-(T_0(y)+\delta \cdot {\dot{T}}(y))}\rho \) is contained in 2 intervals, each of length 2, by (39) and Lemma C.1, we therefore see that
Next we show that the second term on the right hand side of (38) is \(o(\delta )\). Using the definition of the derivative and the fact that \(\rho \) is differentiable a.e. we see that
$$\begin{aligned} D(\delta ):=\frac{\rho (x-T_{0}(y)-\delta \cdot {\dot{T}}(y))-\rho (x-T_{0}(y))}{\delta }\rightarrow -\rho '(x-T_{0}(y))\,{\dot{T}}(y) \end{aligned}$$(40)
for a.e. x, y. Since \(\bigg |\frac{\rho (x-T_{0}(y)-\delta \cdot {\dot{T}} (y))-\rho (x-T_{0}(y))}{\delta }\bigg |\le K{\dot{T}}(y)\), by dominated convergence the limit (40) also converges in \(L^{2}.\) Hence, applying Lemma C.1 to the second term on the right hand side of (38), noting that \(D(\delta )\) in (40) is square-integrable and supported in at most 3 intervals of length at most 2, we obtain
Regarding the final statement, suppose that \(\delta \mapsto T_\delta \) has a continuous derivative with respect to \(\delta \) at a neighborhood of \(\delta =0\). This implies that \({\dot{T}}\) exists and varies continuously on a small interval \([0,\delta ^*]\), with \(0<\delta ^*\le {\bar{\delta }}\). Denote the derivative \(\mathrm{d}T_\delta /\mathrm{d}\delta \) at \(\delta \) by \({\dot{T}}_\delta \), and similarly for \({\dot{k}}\). One has
where the final inequality follows from Lemma C.1 applied to each term in the previous line, noting that \(\rho \) is supported in a single interval of length 2. The first term in the final inequality goes to zero as \(\delta \rightarrow 0\) by continuity of \({\dot{T}}\), and the second term goes to zero as \(\delta \rightarrow 0\) since \(\Vert r_\delta \Vert _2\rightarrow 0\). \(\square \)
6.2 A Formula for the Linear Response of the Invariant Probability Density and Continuity with Respect to Map Perturbations
By considering the kernel form of map perturbations, we can apply Corollary 3.5 to obtain the following.
Proposition 6.3
Let \(L_\delta :L^2\rightarrow L^2\), \(\delta \in [0,{\bar{\delta }})\), be the integral operators in (35) with the kernels \(k_\delta \) as in (36). Suppose that \(L_0\) satisfies (A1) of Theorem 2.2. Then the kernel \({\dot{k}}\) in (37) is in \(V_{\ker }\) and
with convergence in \(L^{2}.\)
Proof
The result is a direct application of Corollary 3.5; we verify its assumptions. From Lemma 6.1, \(k_\delta \in L^2([0,1]^2)\) is a stochastic kernel and so \(L_\delta \) is an integral-preserving compact operator. From Proposition 6.2, \(k_\delta \) has the form (9). Thus, we can apply Corollary 3.5 to obtain the result. \(\square \)
Remark 6.4
If T is covering (Footnote 4) and \(\rho \) is strictly positive in a neighbourhood of zero, one can show the corresponding transfer operator \(L_0\) satisfies assumption (A1) of Theorem 2.2, using arguments similar to e.g. Zmarrou and Homburg (2007) Proposition 8.1, Froyland (2013) Lemmas 3 and 10, or Galatolo and Giulietti (2019) Lemma 41. Let \(f\in L^1\) have zero average: \(\int _{[0,1]} f=0\). If f is 0 almost everywhere, \(L_0^n(f)= 0\) and we are done. Otherwise, given \(\epsilon >0\), we can find an \(f_1\) such that \(\Vert f-f_1\Vert _1<\epsilon \) and \(f_1\) is positive in some small interval \(I\subset [0,1]\). Since \(\rho \) is positive in a neighbourhood of zero, \({\mathrm {supp}}(L_0(f_1^+))\supset T(I)\). By the covering condition there is some \(n'\in {{\mathbb {N}}}\) such that \({\mathrm {supp}}(L_0^{n'}(f_1^+))=[0,1]\). It is then standard to deduce that there is an \(n_0\ge n'\) such that \(\Vert L_0^n(f_1)\Vert _1<\epsilon \) for \(n\ge n_0\). Since the transfer operator contracts the \(L^1\) norm, \(\Vert L_0^n f \Vert _1 \le 2 \epsilon \) for \(n\ge n_0\), and since \(\epsilon \) was arbitrary, this implies that \(L_0\) satisfies (A1).
Let the linear response \({\widehat{R}}:L^2\rightarrow L^2\) of the invariant density be defined as
Lemma 6.5
The function \({\widehat{R}}:L^2\rightarrow L^2\) is continuous.
Proof
We have
where \({\tilde{k}}(x,y) := \left( P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) \right) (x)f_0(y)\). Since \(\frac{\mathrm{d}\rho }{\mathrm{d}x}\in L^\infty \), we have \(\left( P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) \right) (x)\in L^\infty ([0,1]^2)\). From inequality (7), we then have \(f_0\in L^\infty \) and so \({\tilde{k}}\in L^\infty ([0,1]^2)\). We finally have
\(\square \)
6.3 A Formula for the Linear Response of the Dominant Eigenvalues and Continuity with Respect to Map Perturbations
We are also able to express the linear response of the dominant eigenvalues as a function of the perturbing map \({\dot{T}}\). Define
Proposition 6.6
Let \(L_\delta :L^2([0,1],{{\mathbb {C}}})\rightarrow L^2([0,1],{{\mathbb {C}}})\), \(\delta \in [0,{\bar{\delta }})\), be integral operators generated by the kernels \(k_\delta \) as in (36), assume that \(\mathrm{d}\rho /\mathrm{d}x\) is Lipschitz and \(\delta \mapsto T_\delta \) is \(C^1\). Let \(\lambda _{\delta }\) be an eigenvalue of \(L_\delta \) with second largest magnitude strictly inside the unit disk. Suppose that \(L_{0}\) satisfies (A1) of Theorem 2.2 and \(\lambda _{0}\) is geometrically simple. Then
where e is the eigenvector of \(L_0\) associated to the eigenvalue \( \lambda _{0} \) and \({\hat{e}}\) is the eigenvector of \(L_0^*\) associated to the eigenvalue \(\lambda _{0}\).
Proof
Since \(k_\delta \in L^2([0,1]^2,{{\mathbb {R}}})\), \(L_\delta :L^2([0,1],{{\mathbb {C}}})\rightarrow L^2([0,1],{{\mathbb {C}}})\) is compact. From Lemma 6.1 we have that \(k_\delta \) is a stochastic kernel and so \(L_\delta \) preserves the integral (i.e. it satisfies (3)). By Proposition 6.2 the kernel \(k_\delta \) is in the form (9) and the map \(\delta \mapsto k_\delta \) is \(C^1\). By Lemma 3.4 we see that \(\delta \mapsto L_\delta \) is \(C^1\), where the derivative operator \({\dot{L}}\) is the integral operator with the kernel \({\dot{k}}\). Using the assumption that \(L_0\) is mixing and \(\lambda _{0}\) is geometrically simple, we apply Proposition 2.6 to obtain \(\frac{\mathrm{d}\lambda _{\delta }}{\mathrm{d}\delta }\big |_{\delta =0} = \langle {\hat{e}}, {\dot{L}} e\rangle _{L^2([0,1],{\mathbb {C}})}\). Finally, we compute
\(\square \)
From (42), the linear response of the dominant eigenvalues is continuous with respect to map perturbations.
Lemma 6.7
The eigenvalue response function \({\check{R}}:L^2\rightarrow {\mathbb {C}}\) given by \({\check{R}}({\dot{T}})=\langle H,{\dot{T}}\rangle \) is continuous.
Proof
This follows from Cauchy-Schwarz and the fact that \(H\in L^2([0,1],{{\mathbb {C}}})\); the latter claim follows from the fact that \(\left( P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) \right) (x)\in L^\infty ([0,1]^2,{{\mathbb {R}}})\) (see proof of Lemma 6.5) and that \(e,{\hat{e}}\in L^\infty ([0,1],{{\mathbb {C}}})\) (which follows from (7) and the fact that \(k_0\in L^\infty ([0,1]^2,{{\mathbb {R}}})\), see Lemma 6.1). \(\square \)
7 Optimal Linear Response for Map Perturbations
In this section, we derive formulae for the map perturbations that maximise our two types of linear response. We begin by formalising the set of allowable map perturbations then state the formulae.
7.1 The Feasible Set of Map Perturbations
Before we formulate the optimisation problem, we note that in this setting we require some restriction on the space of allowable perturbations to \(T_0\) if we are to interpret \(T_0+\delta {\dot{T}}\) as a map of the unit interval for some \(\delta \) strictly greater than 0 (a non-infinitesimal map perturbation). With this in mind, let \(\ell >0\) and \({\widetilde{F}}_\ell :=\{x\in [0,1]:\ell \le T_0(x)\le 1-\ell \}\); it will turn out that we obtain for free that \({\dot{T}}\in L^\infty \). Note that in principle \(\ell >0\) can be taken as small as one likes, and indeed if one wishes to consider only infinitesimal map perturbations \({\dot{T}}\) then one may set \({\widetilde{F}}_\ell ={\widetilde{F}}_0=[0,1]\). Of course if \(T:S^1\rightarrow S^1\) then one may use \({\widetilde{F}}_\ell ={\widetilde{F}}_0=[0,1]\) even for non-infinitesimal perturbations. Recalling that in Proposition 6.2 we are considering \(L^2\) perturbations \({\dot{T}}\) of the map \(T_0\), we define
Lemma 7.1
\(S_{T_0,\ell }\) is a closed subspace of \(L^2\).
Proof
It is clear that \(S_{T_0,\ell }\) is a subspace. To show it is closed, let \(\{f_n\}\subset S_{T_0,\ell }\) and suppose that \(f_n\rightarrow _{L^2} f\in L^2\). Further, suppose that \({\widetilde{F}}_\ell \) is not [0, 1] up to measure zero; otherwise \(S_{T_0,\ell }=L^2\), which is closed. Then, we have
If \(\int _{{\widetilde{F}}_\ell ^c}f(x)^2\mathrm{d}x>0\), we obtain a contradiction since \(\int _{{\widetilde{F}}_\ell }(f_n(x)-f(x))^2\mathrm{d}x\ge 0\); thus, \(\int _{{\widetilde{F}}_\ell ^c}f(x)^2\mathrm{d}x=0\) and so \(f=0\) a.e. on \({\widetilde{F}}_\ell ^c\). Hence, \(S_{T_0,\ell }\) is closed. \(\square \)
For the remainder of this section, the set of allowable map perturbations that we consider is
where \(B_1\) is the unit ball in \(L^2\). Since \(S_{T_0,\ell }\) is a closed subspace of \(L^2\), it is itself a Hilbert space and so \(P_\ell \) is strictly convex. The following lemma concerns the existence of a perturbation \({\dot{T}}\) for which our objectives will be nonzero; that is, our objective \({\mathcal {J}}\) is not uniformly vanishing. Denote \({\mathcal {P}}(x,y):=P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) (x)\) and let
be our objective. In our first specific objective (optimising response of expectations) we will insert \({\mathcal {E}}(x,y)=((\text {Id}-L_0^*)^{-1}c)(x)f_0(y)\) and in our second specific objective (optimising mixing) we will insert \({\mathcal {E}}(x,y)=E(x,y)\) from (18).
Lemma 7.2
Assume that there is \(F'\subset {\widetilde{F}}_\ell \) such that \(m(F')>0\) and \({\mathcal {E}}(\cdot ,y)\notin {\mathrm {span}}\{{\mathcal {P}}(\cdot ,y)\}^\perp \) for all \(y\in F'\). Then there is a \({\dot{T}}\in P_\ell \) such that \({\mathcal {J}}({\dot{T}})>0\).
Proof
Because
we may set \({\dot{T}}(y)=\int _0^1{\mathcal {P}}(x,y){\mathcal {E}}(x,y)\ \mathrm{d}x\) for \(y\in F'\) and \({\dot{T}}(y)=0\) otherwise to obtain \({\mathcal {J}}({\dot{T}})>0\). Trivial scaling yields \({\dot{T}}\in B_1\). \(\square \)
We expect the hypotheses of Lemma 7.2 to be satisfied “generically”.
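The construction in the proof of Lemma 7.2 can be checked numerically. The sketch below uses synthetic stand-ins for \({\mathcal {P}}\), \({\mathcal {E}}\) and \(F'\) (illustrative choices only, not the objects constructed in the paper) and verifies that choosing \({\dot{T}}(y)=\int _0^1{\mathcal {P}}(x,y){\mathcal {E}}(x,y)\ \mathrm{d}x\) on \(F'\) and zero elsewhere gives a strictly positive objective.

```python
import numpy as np

# Hypothetical stand-ins for P(x,y), E(x,y) and the set F' on an n-point grid;
# these are illustrative choices only.
n = 128
xs = (np.arange(n) + 0.5) / n
P = np.outer(np.sin(np.pi * xs), np.cos(2 * np.pi * xs)) + 0.5
E = np.outer(xs, 1.0 - xs)
Fprime = (xs > 0.2) & (xs < 0.7)

# g(y) = \int_0^1 P(x,y) E(x,y) dx, approximated by a Riemann sum over x
g = (P * E).sum(axis=0) / n

# the choice made in the proof of Lemma 7.2: T_dot = g on F', zero elsewhere
T_dot = np.where(Fprime, g, 0.0)
T_dot /= np.linalg.norm(T_dot) / np.sqrt(n)   # normalise the L^2 norm to 1

# objective J(T_dot) = \int_0^1 g(y) T_dot(y) dy, again via a Riemann sum;
# on F' the integrand is proportional to g(y)^2, so J > 0
J = (g * T_dot).sum() / n
```

Since the integrand of \(J\) is proportional to \(g^2\) on \(F'\), positivity holds for any stand-ins for which \(g\) does not vanish identically on \(F'\), mirroring the genericity remark above.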
7.2 Explicit Formula for the Optimal Map Perturbation that Maximally Increases the Expectation of an Observable
In this section, we consider the problem of finding the optimal map perturbation that maximises the expectation of some observable \(c\in L^2\). We first present a result that ensures a unique solution exists and then derive an explicit expression for the optimal map perturbation.
We begin by noting that \({\widehat{R}}({\dot{T}})\in V\); this follows from the fact that \(\left( P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) \right) (x)f_0(y)\in V_{\ker }\) (since \({\dot{k}}\in V_{\ker }\), see Proposition 6.3) and therefore \(\int _0^1 \left( P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) \right) (x)f_0(y) g(y)\mathrm{d}y\in V\) for \(g\in L^2\) (see Lemma 3.2). Hence, we only need to consider \(c\in \) span\(\{f_0\}^\perp \) (see the discussion at the end of Sect. 4.2).
Proposition 7.3
Let \(c\in \) span\(\{f_0\}^\perp \) and \(P_{\ell }\) be the set in (44). Assume that the function \({\mathcal {J}}({\dot{T}}):=\big \langle c,{\widehat{R}}({\dot{T}})\big \rangle _{L^{2}([0,1],{{\mathbb {R}}})}\) is not uniformly vanishing on \(P_\ell \). Then the optimisation problem
where \({\widehat{R}}\) is as in (41), has a unique solution \({\dot{T}}\in L^2\).
Proof
Let \({\mathcal {H}} = L^2\), \(P=P_\ell \) and \({\mathcal {J}}({\dot{h}}) = \langle c,{\widehat{R}}({\dot{h}})\rangle _{L^2([0,1],{{\mathbb {R}}})}\). Using Lemma 7.1 we note that \(P_\ell \) is closed, as well as bounded, strictly convex and that it contains the zero element of \({\mathcal {H}}\). From Lemma 6.5, it follows that \(\langle c,{\widehat{R}}({\dot{h}} )\rangle _{L^2([0,1],{{\mathbb {R}}})}\) is continuous as a function of \({\dot{h}}\); note that it is also linear in \({\dot{h}}\). By hypothesis, \({\mathcal {J}}\) is not uniformly vanishing on \(P_\ell \). We can therefore apply Propositions 4.1 and 4.3 to conclude that (45) has a unique solution. \(\square \)
Before we present the explicit formula for the optimal solution, we will reformulate the optimisation problem (45) to simplify the analysis. We first note that since the objective function in (45) is linear in \({\dot{T}}\), the maximum will occur on \(S_{T_0,\ell }\cap \partial B_1\). Combining this with the fact that we only need \(c\in \) span\(\{f_0\}^\perp \), we consider the following reformulation of (45):
Problem C
Given \(\ell \ge 0 \) and \(c\in \) span\(\{f_0\}^\perp \) solve
Theorem 7.4
Suppose the transfer operator \(L_0\) associated with the system \((T_0,\rho )\) has a kernel \(k_0\) as in (36), which satisfies (A1) of Theorem 2.2, and there is an \(F'\subset {\widetilde{F}}_\ell \) such that \(m(F')>0\), and \(f_0(y)>0\) and \((\text {Id}-L_0^*)^{-1}c\notin {\mathrm {span}}\{{\mathcal {P}}(\cdot ,y)\}^\perp \) for all \(y\in F'\). Let \({\mathcal {G}}:L^2\rightarrow L^2\) be defined as
Then, the unique solution to Problem C is
Furthermore, \({\dot{T}}\in L^\infty \).
Proof
See Appendix D. \(\square \)
7.3 Explicit Formula for the Optimal Map Perturbation that Maximally Increases the Mixing Rate
In this section, we set up the optimisation problem for mixing enhancement and derive a formula for the optimal map perturbation. We remark that related spectral approaches to mixing enhancement for continuous-time flows were developed in Froyland and Santitissadeekorn (2017), Froyland et al. (2020).
Recall that to enhance mixing in Sect. 5.2, we perturbed \(k_0\) so that the logarithm of the real part of the second eigenvalue decreases. From Lemma 4.4, we have
where \(\lambda _\delta \) denotes the second largest eigenvalue in magnitude (assumed to be simple) of the integral operator \(L_\delta \) with the kernel \(k_\delta = k_0+\delta \cdot {\dot{k}} + o(\delta )\), where \(\delta \mapsto k_\delta \) is \(C^1\) at \(\delta =0\). Since we want to perturb \(T_0\) by \({\dot{T}}\), we reformulate the above inner product. Define
where E(x, y) is as in (18).
Proposition 7.5
Let \(L_\delta :L^2([0,1],{{\mathbb {C}}})\rightarrow L^2([0,1],{{\mathbb {C}}})\), \(\delta \in [0,{\bar{\delta }})\), be integral operators generated by the kernels \(k_\delta \) as in (36), assume that \(\mathrm{d}\rho /\mathrm{d}x\) is Lipschitz and \(\delta \mapsto T_\delta \) is \(C^1\). Let \(\lambda _{\delta }\) be an eigenvalue of \(L_\delta \) with second largest magnitude strictly inside the unit disk. Suppose that \(L_{0}\) satisfies (A1) of Theorem 2.2 and \(\lambda _{0}\) is geometrically simple. Let e and \({\hat{e}}\) be the eigenvectors of \(L_0\) and \(L_0^*\), respectively, corresponding to the eigenvalue \(\lambda _{0}\). Then \({\widehat{E}}\in L^\infty ([0,1],{{\mathbb {R}}})\) and
Proof
We first show that \({\widehat{E}}\in L^\infty ([0,1],{{\mathbb {R}}})\). We can write
where \(\beta _1=\beta _2 = \Re (\lambda _{0})\), \(\beta _3=-\beta _4 = \Im (\lambda _{0}), g_1=g_4 = \Re ({\hat{e}}), g_2=g_3=\Im ({\hat{e}})\), \(h_1=h_3 = \Re (e),h_2=h_4 = \Im (e)\). From the proof of Theorem 7.4, we have \({\mathcal {G}}g_i \in L^\infty ([0,1],{{\mathbb {R}}})\). Also, from Lemma 6.1, we have that \(k_0\in L^\infty ([0,1]^2)\) and therefore \(h_i\in L^\infty ([0,1],{{\mathbb {R}}})\); thus, \({\widehat{E}}\in L^\infty ([0,1],{{\mathbb {R}}})\).
Finally, we compute
\(\square \)
By Proposition 7.5 and equation (50), in order to maximally increase the spectral gap we should choose the map perturbation \({\dot{T}}\) to minimise \(\langle {\dot{T}},{\widehat{E}}\rangle \). We first show this optimisation problem has a unique solution.
Proposition 7.6
Let \(P_\ell \) be the set in (44) and assume that \({\mathcal {J}}({\dot{T}})=\langle {\dot{T}},{\widehat{E}}\rangle \) does not uniformly vanish on \(P_\ell \). Then, the problem of finding \({\dot{T}}\in P_\ell \) such that
has a unique solution.
Proof
Note that \(P_\ell \) is closed (by Lemma 7.1), bounded, strictly convex and contains the zero element of \(L^2\). Now, since \({\mathcal {J}}({\dot{h}}) := \langle {\dot{h}},{\widehat{E}}\rangle _{L^2([0,1],{{\mathbb {R}}})}\) is linear and continuous and by hypothesis does not vanish everywhere on \(P_\ell \), we may apply Propositions 4.1 and 4.3 to obtain the result. \(\square \)
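The structure of the minimiser in Proposition 7.6 is easy to see in a finite-dimensional analogue: for a linear functional over the unit ball of the subspace of vectors vanishing off a given set, Cauchy–Schwarz identifies the optimum as the negatively signed, normalised restriction of the gradient to that set. The sketch below uses synthetic stand-ins for \({\widehat{E}}\) and \({\widetilde{F}}_\ell \); it illustrates the abstract principle, not the paper's specific objects.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
E_hat = rng.normal(size=n)           # synthetic stand-in for the function E-hat
F = np.zeros(n, dtype=bool)
F[20:150] = True                     # synthetic stand-in for the set F_ell

# Minimiser of <t, E_hat> over {t : t = 0 off F, ||t||_2 <= 1}: by
# Cauchy-Schwarz it is minus the normalised restriction of E_hat to F.
t_opt = np.where(F, -E_hat, 0.0)
t_opt /= np.linalg.norm(t_opt)

# sanity check against random feasible competitors on the unit sphere
for _ in range(500):
    t = np.where(F, rng.normal(size=n), 0.0)
    t /= np.linalg.norm(t)
    assert t @ E_hat >= t_opt @ E_hat
```

The optimum attains the value \(-\Vert E_{\textrm{hat}}|_F\Vert _2\), which matches the form of the solutions below: an explicit function supported on \({\widetilde{F}}_\ell \), rescaled to unit norm.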
Since the objective function in (52) is linear, all optima will lie in \(S_{T_0,\ell }\cap \partial B_1\). Hence, we equivalently consider the following optimisation problem:
Problem D
Given \(\ell \ge 0\), solve
We now state a formula for the unique optimum.
Theorem 7.7
Let \((T_0,\rho )\) be a deterministic system with additive noise satisfying (T1) and (T2). Suppose the associated transfer operator \(L_0:L^2([0,1],{{\mathbb {C}}})\rightarrow L^2([0,1],{{\mathbb {C}}})\), with the kernel \(k_0\) as in (36), satisfies (A1) of Theorem 2.2, and that there is an \(F'\subset {\tilde{F}}_\ell \) with \(m(F')>0\) and \(E(\cdot ,y)\notin {\mathrm {span}}\{{\mathcal {P}}(\cdot ,y)\}^\perp \) for all \(y\in F'\). Suppose \(\lambda _0\) is geometrically simple. Then, the unique solution to Problem D is
where E(x, y) is as in (18) and \(\alpha >0\) is selected so that \(\Vert {\dot{T}}\Vert _2 =1\). Furthermore, \({\dot{T}}\in L^\infty \).
Proof
See Appendix E. \(\square \)
Corollary 7.8
If \(\lambda _{0}\) is real, then
where \({\mathcal {G}}\) is the operator in (48). Furthermore, if there exists an \(\ell >0\) such that \(\ell \le T_0(x)\le 1-\ell \) for \(x\in [0,1]\), then
Proof
Since \(e, {\hat{e}}\) and \(\lambda _0\) are real, we have \(E(x,y) = {\hat{e}}(x)e(y)\lambda _0\) and the expression for \({\dot{T}}\) follows from (55). Finally, if \(\ell \le T_0(x)\le 1-\ell \), then \({\widetilde{F}}_\ell = [0,1]\) and we have (56). \(\square \)
8 Applications and Numerical Experiments
In this section, we will consider two stochastically perturbed deterministic systems, namely the Pomeau–Manneville map and a weakly mixing interval exchange map. For each of these maps we numerically estimate:
1. The unique kernel perturbation that maximises the change in expectation of a prescribed observation function (see Problem A). An expression for this optimal kernel is given by (28).
2. The unique kernel perturbation that maximally increases the mixing rate (see Problem B). An expression for this optimal kernel is given by (31) and (32).
3. The unique map perturbation that maximises the change in expectation of a prescribed observation function (see Problem C). An expression for this optimal map perturbation is given by (49).
4. The unique map perturbation that maximally increases the mixing rate (see Problem D). An expression for this optimal map perturbation is given by (55) and (56).
The numerics will be explained as we proceed through these four optimisation problems. We refer the reader to Antown et al. (2018) for additional details on the implementation and related experiments.
8.1 Pomeau-Manneville Map
We consider the Pomeau-Manneville map (Liverani et al. 1999)
with parameter value \(\alpha =1/2\). For this parameter choice it is known that the map \(T_0\) admits a unique absolutely continuous invariant probability measure, but only algebraic decay of correlations (Liverani et al. 1999). With the addition of noise as per (33), the transfer operator defined by (35) and (36) for \(\delta =0\) becomes compact as an operator on \(L^2\). In our numerical experiments we will use the smooth noise kernel \(\rho _\epsilon :[-\epsilon ,\epsilon ]\rightarrow {\mathbb {R}}\), defined by \(\rho _\epsilon (x)=N(\epsilon )\exp (-\epsilon ^2/(\epsilon ^2-x^2))\), where \(N(\epsilon )\) is a normalisation factor ensuring \(\int \rho _\epsilon (x)\ \mathrm{d}x=1\).
We now begin to set up our numerical procedure for estimating \(L_0\), which is a standard application of Ulam’s method (Ulam 1960). Let \(B_n = \{I_1,\dots , I_n\}\) denote an equipartition of [0, 1] into n subintervals, and set \({\mathcal {B}}_n = \) span\(\{{\mathbf {1}} _{I_1},\dots ,{\mathbf {1}}_{I_n}\}\). We define the (Ulam) projection \(\pi _n:L^2([0,1]) \rightarrow {\mathcal {B}}_n\) by \(\pi _n(g) = \sum _{i=1}^n\left( \frac{1}{m(I_i)} \int _{I_i}g(x)\mathrm{d}x\right) {\mathbf {1}}_{I_i}\). The finite-rank transfer operator \(L_{n}:=\pi _n L_0:L^2([0,1])\rightarrow {\mathcal {B}}_n\) can be computed numerically. We use MATLAB’s built-in functions integral.m and integral2.m to perform the \(\rho \)-convolution (using an explicit form of \(\rho _\epsilon \)) and the Ulam projections, respectively. Figure 1 displays the nonzero entries in the column-stochastic matrix corresponding to \(L_n\) for \(\epsilon =0.1\).
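The same construction can be sketched in a few lines of Python in place of the MATLAB quadrature routines. In the sketch below, the integrals are replaced by Monte Carlo quadrature, the Liverani–Saussol–Vaienti form of the Pomeau–Manneville map is assumed (the explicit formula for \(T_0\) is not reproduced in this section), and the cell count, sample sizes and rejection sampler are illustrative choices.

```python
import numpy as np

def T0(x, alpha=0.5):
    # Pomeau-Manneville map in the Liverani-Saussol-Vaienti form (assumed)
    return np.where(x < 0.5, x * (1.0 + (2.0 * x) ** alpha), 2.0 * x - 1.0)

def reflect(x):
    # reflecting boundary conditions on [0, 1]
    x = np.abs(x)
    return np.where(x > 1.0, 2.0 - x, x)

def sample_noise(eps, size, rng):
    # rejection sampling from the bump kernel rho_eps on (-eps, eps); the
    # density is proportional to exp(-eps^2/(eps^2 - xi^2)), whose maximum
    # value exp(-1) is attained at xi = 0
    out = np.empty(0)
    while out.size < size:
        xi = rng.uniform(-eps, eps, size)
        dens = np.exp(-eps**2 / np.maximum(eps**2 - xi**2, 1e-300))
        keep = rng.uniform(0.0, 1.0, size) < dens * np.e
        out = np.concatenate([out, xi[keep]])
    return out[:size]

def ulam_matrix(n, eps, samples_per_cell, seed=0):
    # column j: push samples from cell I_j through y -> T0(y) + noise,
    # reflect into [0,1], and bin into the cells I_1, ..., I_n
    rng = np.random.default_rng(seed)
    L = np.zeros((n, n))
    for j in range(n):
        y = rng.uniform(j / n, (j + 1) / n, samples_per_cell)
        x = reflect(T0(y) + sample_noise(eps, samples_per_cell, rng))
        idx = np.minimum((x * n).astype(int), n - 1)
        np.add.at(L[:, j], idx, 1.0 / samples_per_cell)
    return L
```

The returned matrix is column-stochastic by construction, mirroring the matrix displayed in Fig. 1, and its leading eigenvector approximates the invariant density \(f_n\).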
Approximations to the invariant probability densities for our stochastic dynamics are displayed in Fig. 2 (left) for large and small noise supports. A lower level of noise permits greater concentration of invariant probability mass near the fixed point \(x=0\) of the map \(T_0\). Also shown in Fig. 2 (right) are the estimated eigenfunctions corresponding to the second-largest eigenvalue of \(L_n\). The signs of these second eigenfunctions split the interval [0, 1] into left and right hand portions, broadly indicating that the slow mixing is due to positive mass near \(x=0\) and negative mass away from \(x=0\) (Dellnitz et al. 2000); see Froyland et al. (2011) for further discussion of this point in the Pomeau-Manneville setting.
8.1.1 Kernel Perturbations
In the framework of Problems A and B we use the (arbitrarily chosen) monotonically increasing observation function \(c(x)=-\cos (x)\). In order to estimate \({\dot{k}}\) as in (28) we use the code from Algorithm 3 (Antown et al. 2018); the inputs are the Ulam matrix \(L_n\) and \(c_n\) (obtained as \(\pi _n(c)\)). Equivalently, directly using (28) one may substitute \(f_n\) (obtained as the leading eigenvector of \(L_n\)) for f, \(L_n\) for L, \(c_n\) as above for c, and solve \((Id-L_n^*)^{-1}c_n\) (obtained as a vector \(y\in {\mathbb {R}}^n\) by numerically solving the linear system \((Id-L_n^*)y=c_n, f_n^\top y=0\)). Figure 3 shows the optimal kernel perturbations \({\dot{k}}_n\) for \(n=500\). Because c is an increasing function, intuitively one might expect the kernel perturbation to try to shift mass in the invariant density from left to right. Broadly speaking, this is what one sees in the high-noise case in Fig. 3 (left): vertical strips typically have red above blue, corresponding to a shift of mass to the right in [0, 1]. The main exception to this is around the y-axis value of 1/2, where red is strongly below blue along vertical strips. This is because at the next iteration, these red regions will be mapped near \(x=1\) and achieve the highest value of c, while the blue regions will be mapped near to \(x=0\) with the least value of c. In the low-noise case of Fig. 3 (right), we see a similar solution with higher spatial frequencies, and strong kernel perturbations near the critical values of \(x=0\) and \(T_0(x)=1/2\).
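The linear solve described above, namely solving \((Id-L_n^*)y=c_n\) together with the constraint \(f_n^\top y=0\), can be sketched as follows. A random column-stochastic matrix is a stand-in for the actual Ulam matrix \(L_n\), and the observable is projected onto span\(\{f_n\}^\perp \) so that the system is solvable, as in the reduction to \(c\in \) span\(\{f_0\}^\perp \) discussed in Sect. 7.2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# stand-in for the Ulam matrix L_n: a random column-stochastic matrix
L = rng.uniform(size=(n, n))
L /= L.sum(axis=0)

# f_n: eigenvector of L for the eigenvalue 1, normalised to sum to 1
vals, vecs = np.linalg.eig(L)
f = np.real(vecs[:, np.argmax(np.real(vals))])
f /= f.sum()

# observable c_n on cell midpoints, projected onto span{f_n}^perp
grid = (np.arange(n) + 0.5) / n
c = -np.cos(grid)
c -= (c @ f) / (f @ f) * f

# solve (Id - L_n^*) y = c_n together with the constraint f_n^T y = 0,
# via a stacked least-squares system
A = np.vstack([np.eye(n) - L.T, f[None, :]])
b = np.concatenate([c, [0.0]])
y, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The constraint row removes the one-dimensional kernel of \(Id-L_n^\top \) (spanned by the constant vector for a column-stochastic matrix), so the stacked system has a unique exact solution.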
To investigate the optimal kernel perturbation to maximally increase the rate of mixing in the stochastic system, we use the expression for \({\dot{k}}\) in (31). A natural approximation of (31) requires estimates of the left and right eigenfunctions of \(L_0\) corresponding to the second largest eigenvalue \(\lambda _2\); these are obtained directly as eigenvectors of \(L_n\). Figure 4 shows the resulting optimal kernel perturbations, computed using the code from Algorithm 4 (Antown et al. 2018) with input \(L_n\). Because the fixed point at \(x=0\) is responsible for the slow algebraic decay of correlations for the deterministic dynamics of \(T_0\), the fixed point will also play a dominant role in the mixing rate of the stochastic system for low to moderate levels of noise. Indeed, Fig. 4 shows that the optimal kernel perturbation concentrates its effort in a neighbourhood of the fixed point, and pushes mass away from the fixed point as much as possible. This is particularly extreme in the low noise case of Fig. 4 (right) with the perturbation almost exclusively concentrated in a small neighbourhood of \(x=0\).
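Extracting the second eigenvalue together with its left and right eigenvectors from a matrix approximation of \(L_0\) can be sketched as follows. A random column-stochastic matrix again stands in for the Ulam matrix, and the biorthogonal normalisation \(\langle {\hat{e}},e\rangle =1\) is a convention chosen here for convenience.

```python
import numpy as np

def second_eigendata(L):
    # right eigenvectors of L, and left eigenvectors as eigenvectors of L^T
    vals, right = np.linalg.eig(L)
    lvals, left = np.linalg.eig(L.T)
    order = np.argsort(-np.abs(vals))
    lam = vals[order[1]]                    # second largest in magnitude
    e = right[:, order[1]]
    j = np.argmin(np.abs(lvals - lam))      # left eigenvector for the same lam
    e_hat = left[:, j] / (left[:, j] @ e)   # normalise so that e_hat . e = 1
    return lam, e, e_hat

# stand-in for the Ulam matrix: a random column-stochastic matrix
rng = np.random.default_rng(2)
L = rng.uniform(size=(40, 40))
L /= L.sum(axis=0)
lam2, e, e_hat = second_eigendata(L)
```

With this normalisation the rank-one spectral projector associated with \(\lambda _2\) is simply the outer product of the two eigenvectors, which is a convenient check on the computation.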
8.1.2 Map Perturbations
We now turn to the problem of finding the unique map perturbation \({\dot{T}}\) that maximises the change in expectation of the observation \(c(x)=-\cos (x)\) (see Problem C for a precise formulation) and maximises the speed of mixing (see Problem D). We use the natural Ulam discretisation of the expression (49) (Footnote 5). The objects \(f_n\) and \((Id-L_n^*)^{-1}c_n\) are computed exactly as before in Sect. 8.1.1. The action of the operator \({\mathcal {G}}\) in (49) is computed with MATLAB's built-in function integral.m, using an explicit form of \(\mathrm{d}\rho _\epsilon /\mathrm{d}x\) for \(\mathrm{d}\rho /\mathrm{d}x\) in (49).
Figure 5 (left) shows the optimal \({\dot{T}}\) for the two noise amplitudes \(\epsilon =1/10\) and \(\epsilon =\sqrt{6}/100\). Note that for the noise amplitude \(\epsilon =0.1\) (blue curve in Fig. 5) the map perturbation \({\dot{T}}\) is mostly positive, corresponding to moving probability mass to the right, as expected because we are maximising the change in expectation of an increasing observation function c. The blue curve is most negative in neighbourhoods of the two preimages of \(x=1/2\), corresponding to moving probability mass to the left. The reason for this is identical to the discussion of the “blue above red” effect in Fig. 3, namely moving mass to the left creates a very large increase in the objective function value at the next iterate. This “look ahead” effect is even more pronounced in the low noise case (red curve of Fig. 5), where \({\dot{T}}\) is mostly positive, but has deep negative map perturbations at multiple preimages of \(x=1/2\) reaching further into the past.
Figure 5 (right) illustrates the Pomeau-Manneville map (black) with perturbed maps \(T_0+{\dot{T}}/100\). We have chosen a scale factor of 1/100 for visualisation purposes; one should keep in mind we have optimised for an infinitesimal change in the map. Figure 6 shows the kernel derivatives \({\dot{k}}\) corresponding to the optimal map derivatives \({\dot{T}}\) for the two noise levels. These kernel derivatives have a restricted form because they arise purely from a derivative in the map. One may compare Fig. 6 with Fig. 3 and note that the kernel derivative in Fig. 6 (left) attempts to follow the general structure of the kernel derivative in Fig. 3 (left), while obeying its structural restrictions arising from the less flexible map perturbation. Broadly speaking, in Fig. 6 (left), red lies above blue (mass is shifted to the right). Exceptions are near \(y=1/2\) because at the next iteration these red points will land near \(x=1\), achieving very high objective value, while the blue region will get mapped to near \(x=0\), encountering the lowest value of c. Note that the map perturbation decreases from a peak to very close to zero near \(x=0\). This is because in a small neighbourhood of \(x=0\) there is already some stochastic perturbation away from \(x=0\) “for free” due to the reflecting boundary conditions imposed by \(\pi \). Thus, the map perturbation \({\dot{T}}\) does not need to invest energy in large perturbations very close to \(x=0\).
Finding the map perturbation that maximally increases the rate of mixing is a particularly interesting question. Our computations use the natural Ulam discretisation of (56). The computations follow as in Sect. 8.1.1, with the action of \({\mathcal {G}}\) computed as above. Figure 7 (left) shows the optimal \({\dot{T}}\) for the two noise amplitudes \(\epsilon =1/10\) and \(\epsilon =\sqrt{6}/100\). A sharp map perturbation away from \(x=0\) is seen for both noise levels, with the perturbation sharper in the lower noise case. In both cases, the map perturbations far from \(x=0\) are weak (low magnitude values of \({\dot{T}}\)). This result corresponds well with the results seen for the optimal kernel perturbations in Fig. 4, where mass was primarily moved away from \(x=0\). As in the optimal solution shown in Fig. 5 (left), the optimal map perturbation in Fig. 7 decreases from a sharp peak down to zero near \(x=0\). This is again because in a small neighbourhood of \(x=0\) the system experiences "free" stochastic perturbations away from \(x=0\) due to the reflecting boundary conditions, and thus the map perturbation \({\dot{T}}\) need not invest energy in large perturbations very close to \(x=0\). Figure 7 (right) illustrates the Pomeau-Manneville map (black) with perturbed maps \(T_0+{\dot{T}}/100\), where again the factor 1/100 is just for illustrative purposes; we are optimising an infinitesimal map perturbation. When inspecting the kernel derivatives \({\dot{k}}\) corresponding to the optimal map perturbations \({\dot{T}}\) in Fig. 8, we see similar behaviour to those in Fig. 7.
8.2 Interval Exchange Map
In our second example, we consider a weak-mixing interval exchange map; we choose this class of maps because of the existing literature on mixing optimisation for interval exchanges with the addition of noise. Avila and Forni (2007) prove that a typical interval exchange is either weak mixing or an irrational rotation. We use a specific weak-mixing (Sinai and Ulcigrai 2005) interval exchange map \(T_0\) with interval permutation \((1234)\mapsto (4321)\) and interval lengths given by the normalised entries of the leading eigenvector of the matrix \(\left( \begin{array}{cccc} 13&{}37&{}77&{}47\\ 10&{}30&{}60&{}37\\ 3&{}10&{}24&{}14\\ 4&{}10&{}19&{}12 \end{array} \right) \); see equation (51) in Sinai and Ulcigrai (2005). We again form a stochastic system using the same noise kernels as for the Pomeau-Manneville map in Sect. 8.1. The mixing properties of this map have been studied in Froyland et al. (2016). Figure 9 shows the column-stochastic matrix corresponding to \(L_n\) for \(n=500\) and \(\epsilon =0.1\).
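The interval lengths and the resulting interval exchange can be computed directly from the matrix above. The following sketch recovers the discontinuity locations at approximately \(x=0.43, 0.77, 0.89\) mentioned below; the function name T0 and the endpoint bookkeeping are our own conventions.

```python
import numpy as np

# substitution matrix from Sinai and Ulcigrai (2005), as quoted above
M = np.array([[13, 37, 77, 47],
              [10, 30, 60, 37],
              [ 3, 10, 24, 14],
              [ 4, 10, 19, 12]], dtype=float)

# interval lengths: the normalised entries of the leading (Perron) eigenvector
vals, vecs = np.linalg.eig(M)
v = np.real(vecs[:, np.argmax(np.real(vals))])
lengths = v / v.sum()

# interval exchange with permutation (1234) -> (4321): the four subintervals
# are reassembled in reverse order
left = np.concatenate([[0.0], np.cumsum(lengths)])            # original endpoints
new_left = np.concatenate([[0.0], np.cumsum(lengths[::-1])])  # endpoints after exchange

def T0(x):
    # locate the subinterval containing x, then translate it to its new slot
    i = int(np.searchsorted(left, x, side='right')) - 1
    i = min(max(i, 0), 3)
    return new_left[3 - i] + (x - left[i])
```

The interior endpoints of the subintervals are the discontinuity points of \(T_0\), and the fourth (rightmost) subinterval is translated to the front of \([0,1]\).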
8.2.1 Kernel Perturbations
In the framework of Problem A, we use the same observation function \(c(x)=-\cos (x)\) as in the Pomeau-Manneville case study, and estimate the optimal kernel perturbation \({\dot{k}}\) that maximally increases the expectation of c in an identical fashion. In broad terms, one again sees that \({\dot{k}}\) attempts to shift invariant probability mass to the right in [0, 1]. In Fig. 10 (left), in each smooth part of the support of \({\dot{k}}\), red is “above” blue, meaning mass is pushed to the right.
Clear exceptions to the “red above blue” scheme are seen as three sharp horizontal lines. The y-coordinates of these three sharp horizontal lines coincide with the three points of discontinuity in the domain of the interval exchange at approximately \(x=0.43, 0.77, 0.89\). Consider the sharp horizontal “blue above red” line at \(y\approx 0.43\). According to Fig. 9, under the action of the kernel \(k_0\), mass in the vicinity of \(x=0.6\) will be transported near to \(x=0.43\). The perturbation \({\dot{k}}\) shown in Fig. 10 will then tend to push this mass to the left of \(x=0.43\). Thus, on the next iteration there will be a bias for mass to be mapped near to \(x=1\) rather than near \(x=0.25\), achieving a much larger objective value at this iterate. A similar reasoning applies to the “blue above red” horizontal lines at \(y\approx 0.77\) and 0.89; the contrast is a little weaker because the potential gain at the next iterate is also weaker. The low noise case, Fig. 10 (right), displays similar behaviour to the higher noise case of Fig. 10 (left). With lower noise, the deterministic dynamics plays a greater role and additional preimages are taken into account, leading to a more oscillatory optimal \({\dot{k}}\).
To investigate the optimal kernel perturbation to maximally increase the rate of mixing in the stochastic system (in the framework of Problem B) we use the expression \({\dot{k}}\) in (31). The method of numerical approximation is identical to that used for the Pomeau-Manneville map. Figure 11 shows the signed distribution of mass that is responsible for the slowest real (Footnote 6) exponential rate of decay in the stochastic system. This eigenfunction becomes more oscillatory as the level of noise decreases, and as must be the case, the magnitude of the corresponding eigenvalue increases from \(\lambda \approx 0.7476\) (\(\epsilon =1/10\)) to \(\lambda \approx 0.9574\) (\(\epsilon =\sqrt{6}/100\)). Because the sign of these eigenvalues is negative, one expects a pair of almost-2-cyclic sets (Dellnitz and Junge 1999), consisting of three subintervals each, given by the positive and negative supports of the eigenfunctions.
Figure 12 shows the approximate optimal kernel perturbations. In the high-noise situation of Fig. 12 (left), the sharp horizontal changes are present at preimages of the deterministic dynamics, as they were in Fig. 10 (left). The importance of the break points to the overall mixing rate is thus clearly borne out in the optimal \({\dot{k}}\), although a precise interpretation of the optimal \({\dot{k}}\) is less straightforward. For the low noise case (Fig. 12 (right)) it appears that there is an alternating shifting of mass left and right with alternating “red above blue” and “blue above red”. This leads to greater mixing at smaller spatial scales than is possible in a single iteration of the deterministic interval exchange. We anticipate that decreasing the noise amplitude further will result in more rapid alternation of “red above blue” and “blue above red”. As the diffusion amplitude decreases, the efficient large-scale diffusive mixing is no longer possible and so a transition is made to small-scale mixing, accessed by increasing oscillation in the kernel.
8.2.2 Map Perturbations
The computations in this section follow those of Sect. 8.1.2. Figure 13 (left) shows the optimal map perturbations \({\dot{T}}\) at two different noise levels. Figure 13 (right) illustrates \(T_0+{\dot{T}}/100\) for the two noise levels. The kernel perturbations generated by these optimal map perturbations are displayed in Fig. 14. Comparing the kernel perturbations in Fig. 14 with the more flexible kernel perturbations in Fig. 10, one sees that the two sets of kernel perturbations are broadly consistent with one another in terms of the relative positions of the positive and negative (red and blue) perturbations. Note that the more restrictive kernel derivative in Fig. 14 by construction cannot replicate the sharp horizontal red–blue switches in Fig. 10. It turns out that the strongest of these red–blue switches, namely the one at \(y\approx 0.43\) in Fig. 10 (left), is approximated as well as a map perturbation allows, see Fig. 14 (left), while the other two (weaker) horizontal red/blue switches seen in Fig. 10 are ignored.
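The structural restriction on map-induced kernel perturbations can be made concrete. For additive noise with density \(\rho \), and ignoring the reflecting boundary so that \(k_0(x,y)=\rho (x-T_0(y))\), a map perturbation \({\dot{T}}\) induces the kernel perturbation \({\dot{k}}(x,y)=-{\dot{T}}(y)\,\rho '(x-T_0(y))\): each y-slice of the kernel is a rigid translation of the noise profile, so sharp sign changes in x of the kind seen in Fig. 10 cannot be produced. A minimal sketch, with hypothetical \(T_0\), \({\dot{T}}\) and a Gaussian \(\rho \) (none of these are the paper's):

```python
import numpy as np

def induced_kernel_dot(T0, Tdot, rho_prime, xs, ys):
    # kdot(x, y) = -Tdot(y) * rho'(x - T0(y)): each y-slice of the kernel
    # is translated rigidly, so only the noise profile's own shape appears in x.
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    return -Tdot(Y) * rho_prime(X - T0(Y))

sigma2 = 0.01                                             # hypothetical noise variance
rho    = lambda u: np.exp(-u**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
rho_p  = lambda u: -(u / sigma2) * rho(u)
T0     = lambda y: 0.5 + 0.3 * np.sin(2 * np.pi * y)      # hypothetical base map
Tdot   = lambda y: np.sin(np.pi * y)                      # hypothetical perturbation
xs = np.linspace(0.0, 1.0, 60)
ys = np.linspace(0.0, 1.0, 60)
kdot = induced_kernel_dot(T0, Tdot, rho_p, xs, ys)
```

The formula is simply the chain rule applied to \(t\mapsto \rho (x-T_0(y)-t{\dot{T}}(y))\) at \(t=0\), and can be checked against a finite-difference quotient.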
We now turn to optimal map perturbations for the mixing rate. The combined effect of the “cutting and shuffling” of interval exchanges with diffusion on mixing rates has been widely studied, e.g. Ashwin et al. (2002), Sturman (2012), Froyland et al. (2016), Kreczak et al. (2017), Wang and Christov (2018), including investigations of the impact of changing the diffusion or the interval exchange on mixing. The very general type of formal map optimisation we consider here has not been attempted before, and we hope that our novel techniques will stimulate interesting new research questions and motivate more sophisticated experiments in the field of mixing optimisation.
Under repeated iteration, the original interval exchange \(T_0\) cuts and shuffles the unit interval into an increasing number of smaller pieces, assisting the small-scale mixing of diffusion. Our results in Fig. 15 (left) show an oscillatory \({\dot{T}}\), with increasing oscillations as the noise amplitude decreases. This increased oscillation effect is also seen when comparing the left and right panes of Fig. 16. Thus, the optimisation attempts to include some additional mixing by rapid local warping of the phase space. It is plausible that this additional warping effect enhances mixing beyond the rigid shuffling of the interval exchange. An illustration of \(T_0+{\dot{T}}/100\) is given in Fig. 15. We emphasise that the factor 1/100 is for visualisation purposes only; for sufficiently small factors, the perturbed map remains a piecewise homeomorphism (modulo small overshoots at the boundaries, which are taken care of by the reflecting boundary conditions on the noise).
Notes
1. We use the notation \({\mathbf {1}}\) for the constant function and \({\mathbf {1}}_A\) for the indicator function of the set A.
2. We will also denote \(L^p:=L^p([0,1],{{\mathbb {R}}})\); this notation will not be used for \(L^{2}([0,1],{{\mathbb {C}}})\).
3. The relative interior of a closed convex set C is the interior of C relative to the closed affine hull of C, see e.g. Borwein and Goebel (2003).
4. We say T is covering if for each small open interval \(I\subseteq [0,1]\) there is \(n=n(I)\) such that \(T^n(I)=[0,1]\).
5. Note that since \(T_0^{-1}(\{0,1\})\) is a finite set, we may take \(\ell >0\) as small as we like. In the computations we set \(\ell =0\), so that \({\widetilde{F}}_\ell =[0,1]\) mod m.
6. In our numerical experiments the largest-magnitude real eigenvalue appears as the sixth (resp. fourth) eigenvector of \(L_{500}\) for \(\epsilon =1/10\) (resp. \(\epsilon =\sqrt{6}/100\)). Complex eigenvalues of slightly larger magnitude (0.8411 and 0.9609, respectively) are present, but we do not investigate these in order to keep the dynamical interpretation straightforward.
References
Antown, F., Dragičević, D., Froyland, G.: Optimal linear responses for Markov chains and stochastically perturbed dynamical systems. J. Stat. Phys. 170(6), 1051–1087 (2018)
Ashwin, P., Nicol, M., Kirkby, N.: Acceleration of one-dimensional mixing by discontinuous mappings. Phys. A Stat. Mech. Appl. 310(3–4), 347–363 (2002)
Avila, A., Forni, G.: Weak mixing for interval exchange transformations and translation flows. Ann. Math. 66, 637–664 (2007)
Bahsoun, W., Ruziboev, M., Saussol, B.: Linear response for random dynamical systems. Adv. Math. 364, 107011 (2020)
Baladi, V.: Linear response, or else. ICM Seoul 2014 talk (2014). arXiv:1408.2937
Bonnans, J., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, Berlin (2013)
Borwein, J., Goebel, R.: Notions of relative interior in Banach spaces. J. Math. Sci. 115(4), 66 (2003)
Conway, J.: A Course in Functional Analysis, vol. 96. Springer, Berlin (2013)
Dellnitz, M., Junge, O.: On the approximation of complicated dynamical behavior. SIAM J. Numer. Anal. 36(2), 491–515 (1999)
Dellnitz, M., Froyland, G., Sertl, S.: On the isolated spectrum of the Perron–Frobenius operator. Nonlinearity 13(4), 1171 (2000)
Dragičević, D., Sedro, J.: Statistical stability and linear response for random hyperbolic dynamics (2020). arXiv:2007.06088
Eveson, S.: Compactness criteria for integral operators in \(L^\infty \) and \(L^1\) spaces. Proc. Am. Math. Soc. 123(12), 3709–3716 (1995)
Faggionato, A., Gantert, N., Salvi, M.: Einstein relation and linear response in one-dimensional Mott variable-range hopping. Ann. Inst. H. Poincaré Probab. Stat. 55(3), 1477–1508 (2019)
Froyland, G.: An analytic framework for identifying finite-time coherent sets in time-dependent dynamical systems. Phys. D 250, 1–19 (2013)
Froyland, G., Santitissadeekorn, N.: Optimal mixing enhancement. SIAM J. Appl. Math. 77(4), 1444–1470 (2017)
Froyland, G., Murray, R., Stancevic, O.: Spectral degeneracy and escape dynamics for intermittent maps with a hole. Nonlinearity 24(9), 2435 (2011)
Froyland, G., González-Tokman, C., Watson, T.: Optimal mixing enhancement by local perturbation. SIAM Rev. 58(3), 494–513 (2016)
Froyland, G., Koltai, P., Stahn, M.: Computation and optimal perturbation of finite-time coherent sets for aperiodic flows without trajectory integration. SIAM J. Appl. Dyn. Syst. 19(3), 1659–1700 (2020)
Galatolo, S.: Quantitative statistical stability and speed of convergence to equilibrium for partially hyperbolic skew products. J. Éc. Pol. Math. 5, 377–405 (2018)
Galatolo, S., Giulietti, P.: A linear response for dynamical systems with additive noise. Nonlinearity 32(6), 2269 (2019)
Galatolo, S., Pollicott, M.: Controlling the statistical properties of expanding maps. Nonlinearity 30, 2737–2751 (2017)
Galatolo, S., Sedro, J.: Quadratic response of random and deterministic dynamical systems. Chaos 30(2), 023113 (2020)
Gantert, N., Mathieu, P., Piatnitski, A.: Einstein relation for reversible diffusions in random environment. Comm. Pure Appl. Math. 65(2), 187–228 (2012)
Gantert, N., Guo, X., Nagel, J.: Einstein relation and steady states for the random conductance model. Ann. Probab. 45(4), 2533–2567 (2017)
Ghil, M., Lucarini, V.: The physics of climate variability and climate change. Rev. Mod. Phys. 92(3), 035002 (2020)
Gouëzel, S., Liverani, C.: Banach spaces adapted to Anosov systems. Ergod. Theory Dyn. Syst. 26, 189–217 (2006)
Hairer, M., Majda, A.: A simple framework to justify linear response theory. Nonlinearity 23, 909–922 (2010)
Hennion, H., Hervé, L.: Limit Theorems for Markov Chains and Stochastic Properties of Dynamical Systems by Quasi-compactness, vol. 1766. Springer, Berlin (2001)
Kato, T.: Perturbation Theory for Linear Operators. Reprint of the 1980 edition. Classics in Mathematics. Springer, Berlin (1995)
Kloeckner, B.R.: The linear request problem. Proc. Am. Math. Soc. 146, 2953–2962 (2018)
Kolmogorov, A., Fomin, S.: Elements of the Theory of Functions and Functional Analysis. Volume 2: Measure. The Lebesgue Integral. Hilbert Space. Graylock (1961)
Koltai, P., Lie, H.C., Plonka, M.: Fréchet differentiable drift dependence of Perron-Frobenius and Koopman operators for non-deterministic dynamics. Nonlinearity 32(11), 4232 (2019)
Komorowski, T., Olla, S.: On mobility and Einstein relation for tracers in time-mixing random environments. J. Stat. Phys. 118(3/4), 407–435 (2005)
Kreczak, H., Sturman, R., Wilson, M.C.: Deceleration of one-dimensional mixing by discontinuous mappings. Phys. Rev. E 96(5), 053112 (2017)
Lasota, A., Mackey, M.: Probabilistic Properties of Deterministic Systems. Cambridge University Press, Cambridge (1985)
Liverani, C., Saussol, B., Vaienti, S.: A probabilistic approach to intermittency. Ergod. Theory Dyn. Syst. 19(3), 671–685 (1999)
Luenberger, D.: Optimization by Vector Space Methods. Wiley, New York (1969)
MacKay, R.: Management of complex dynamical systems. Nonlinearity 31, R52–R66 (2018)
Marangio, L., Sedro, J., Galatolo, S., Di Garbo, A., Ghil, M.: Arnold maps with noise: differentiability and non-monotonicity of the rotation number. J. Stat. Phys. 6, 66 (2019)
Mathieu, P., Piatnitski, A.: Steady states, fluctuation-dissipation theorems and homogenization for diffusions in a random environment with finite range of dependence. Arch. Rational Mech. Anal. 230(3/4), 277–320 (2018)
Reed, M., Simon, B.: Methods of Modern Mathematical Physics. Volume I: Functional Analysis. Academic Press, London (1980)
Ruelle, D.: Differentiation of SRB states. Commun. Math. Phys. 187, 227–241 (1997)
Sedro, J.: On Regularity Loss in Dynamical Systems. PhD thesis (2019). https://www.theses.fr/2018SACLS254
Sedro, J., Rugh, H.H.: Regularity of characteristic exponents and linear response for transfer operator cocycles (2020). arXiv:2004.10103
Sinai, Y., Ulcigrai, C.: Weak mixing in interval exchange transformations of periodic type. Lett. Math. Phys. 74(2), 111–133 (2005)
Sturman, R.: The role of discontinuities in mixing. Adv. Appl. Mech. 45, 51–90 (2012)
Ulam, S.: A Collection of Mathematical Problems, vol. 8. Interscience Publishers, New York (1960)
Wang, M., Christov, I.C.: Cutting and shuffling with diffusion: evidence for cut-offs in interval exchange maps. Phys. Rev. E 98(2), 022221 (2018)
Zmarrou, H., Homburg, A.: Bifurcations of stationary measures of random diffeomorphisms. Ergod. Theory Dyn. Syst. 27, 1651–1692 (2007)
Acknowledgements
FA is supported by a UNSW University Postgraduate Award. GF is partially supported by an ARC Discovery Project. FA and GF thank the Department of Mathematics at the University of Pisa for generous support and hospitality. SG is partially supported by the research project PRIN 2017S35EHN_004 “Regular and stochastic behaviour in dynamical systems” of the Italian Ministry of Education and Research.
Funding
Open Access funding enabled and organized by CAUL and its Member Institutions.
Additional information
Communicated by Oliver Junge.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Proof of Theorem 5.4
First we need a technical lemma. We note that the statement of the lemma is analogous to the continuity of \((\text {Id}-L_0)^{-1}\), which was treated in the proof of Theorem 2.2.
Lemma A.1
Consider the closed subspace span\(\{f_0\}^\perp \subset L^2\) equipped with the \(L^2\) norm. Then, the operator \((\text {Id}-L_0^*)^{-1}:\) span\( \{f_0\}^\perp \rightarrow \) span\(\{f_0\}^\perp \) is bounded.
Proof
We begin by finding the kernel and range of the operator \(\text {Id}-L_0^*\). Recall that \(L_0(V)\subset V\) and that \(L_0\) preserves a one-dimensional eigenspace span\(\{f_0\}\), with eigenvalue 1. Thus, we have \(\ker (\text {Id}-L_0)=\) span\(\{f_0\}\) and ran\((\text {Id}-L_0)\subset V\). Recalling that \(L_0:V\rightarrow V\) is compact and \(f_0\not \in V\), we have by the Fredholm alternative (see Dragicevic and Sedro 2020, VII.11) that for any \(g\in V\), there exists a unique \(h\in V\) such that \(g=(\text {Id}-L_0)h\). Hence, ran(\(\text {Id}-L_0)=V\). Since V is closed, the range of \(\text {Id}-L_0\) is closed and so, by the Closed Range Theorem (Theorem 5.13, IV-§5.2, Kato 1995), we have \(\text {ran}((\text {Id}-L_0)^*) = \ker (\text {Id}-L_0)^\perp =\) span\(\{f_0\}^\perp \), which is a co-dimension 1 space, and \(\ker ((\text {Id}-L_0)^*)=\) ran(\(\text {Id}-L_0)^\perp = V^\perp =\) span\(\{{\mathbf {1}}\}^{\perp \perp }=\) span\(\{{\mathbf {1}}\}\), where the last equality follows from Corollary 1.41 in III-§1.8 (Kato 1995) and the fact that span\(\{{\mathbf {1}}\}\) is a finite-dimensional closed subspace of \(L^2\).
To prove that \((\text {Id}-L_0^*)^{-1}:\) span\( \{f_0\}^\perp \rightarrow \) span\(\{f_0\}^\perp \) is bounded, we will use the Inverse Mapping Theorem (Theorem III.11, Reed and Simon 1980). Since the integral operator \(L_0^*\) has an \(L^2\) kernel, by (6) and the triangle inequality it follows that \(\text {Id}-L_0^*\) is bounded. Also, from the Fredholm alternative argument above, \(\text {Id}-L_0^*:{\mathrm {span}}\{f_0\}^\perp \rightarrow {\mathrm {span}}\{f_0\}^\perp \) is surjective. Thus, to apply the Inverse Mapping Theorem, we just need to show that \(\text {Id}-L_0^*\) is injective on span\(\{f_0\}^\perp \). Let \(f_1,f_2\in \) span\( \{f_0\}^\perp \) be such that \((\text {Id}-L_0^*)f_1=(\text {Id}-L_0^*)f_2\). Thus, \(f_1-f_2\in \ker (\text {Id}-L_0^*) =\) span\(\{{\mathbf {1}}\}\) and so \(f_1-f_2 = \gamma {\mathbf {1}}\) for some \(\gamma \in {{\mathbb {R}}}\). Since \(f_1-f_2\in \) span\(\{f_0\}^\perp \), we have that \(0 = \int (f_1(x)-f_2(x))f_0(x)\mathrm{d}x=\gamma \int f_0(x)\mathrm{d}x\) and so \( \gamma =0\) (since \(\int f_0(x)\mathrm{d}x =1\)), i.e. \(f_1=f_2\); thus, \((\text {Id}-L_0^*)\) is injective and the result follows. \(\square \)
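A finite-dimensional analogue makes Lemma A.1 concrete: for a positive row-stochastic matrix P (so that \(L_0=P^\top \) preserves integrals and has the simple eigenvalue 1 with invariant density \(f_0\)), the operator \(\text {Id}-L_0^*\) is invertible on the orthogonal complement of \(f_0\). A sketch, with a random matrix standing in for the kernel operator:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)            # row-stochastic; transfer operator L0 = P.T
w, V = np.linalg.eig(P.T)
f0 = np.real(V[:, np.argmax(np.real(w))])    # invariant density: L0 f0 = f0
f0 /= f0.sum()
B = np.linalg.svd(f0.reshape(1, -1))[2][1:].T   # orthonormal basis of span{f0}^perp
A = B.T @ (np.eye(n) - P) @ B                # (Id - L0^*) restricted to span{f0}^perp
g = rng.standard_normal(n)
g -= (g @ f0) / (f0 @ f0) * f0               # project g into span{f0}^perp
h = B @ np.linalg.solve(A, B.T @ g)          # the unique preimage in span{f0}^perp
```

Here \(\text {Id}-L_0^*=I-P\) maps span\(\{f_0\}^\perp \) into itself because \(P^\top f_0=f_0\), exactly as in the proof above, and the restricted matrix A is invertible because 1 is a simple eigenvalue of P.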
Proof of Theorem 5.4
We will use the method of Lagrange multipliers to derive the expression (28) from the first-order necessary conditions for optimality and then show that such a \({\dot{k}}\) satisfies the second-order sufficient conditions. To this end, we consider the Lagrangian function
\({\mathcal {L}}({\dot{k}},\mu ):=f({\dot{k}})+\mu g({\dot{k}}),\)
where \(f({\dot{k}}):=-\big \langle c,R({\dot{k}})\big \rangle _{L^{2}([0,1],{{\mathbb {R}}})},\) \(g( {\dot{k}}):=\Vert {\dot{k}}\Vert _{L^{2}([0,1]^{2})}^{2}-1\) and \({\dot{k}}\in V_{\ker }\cap S_{k_0,l}\).
Necessary conditions: We verify the conditions in Theorem 2, §7.7, (Luenberger 1969). We want to find \({\dot{k}}\) and \(\mu \) that satisfy the first-order necessary conditions:
where \(D_{{\dot{k}}}{\mathcal {L}}({\dot{k}},\mu )\in {\mathcal {B}}(L^{2}([0,1]^{2}),{{\mathbb {R}}})\) is the Fréchet derivative with respect to the variable \({\dot{k}}\). Since f is linear, we have \((D_{{\dot{k}}}f){\tilde{k}}=f({\tilde{k}})\). Also, \( (D_{{\dot{k}}}g){\tilde{k}}=2\langle {\dot{k}},{\tilde{k}}\rangle _{L^{2}([0,1]^{2})} \) since
Thus, for the necessary conditions of the Lagrange multiplier method to be satisfied, we need that
for all \({\tilde{k}}\in V_{\ker }\cap S_{k_0,l}\) and
Noting Lemma A.1 and the fact that \(c\in \) span\(\{f_0\}^\perp \), we have
We claim that
satisfies the necessary condition (58) and lies in \(V_{\ker }\cap S_{k_0,l}\). Before we verify this, we show that
where \({\hat{g}}(y):=\frac{1}{m(F_l^y)}\int _{F_l^y}((\text {Id}-L_{0}^{*})^{-1}c)(z)\mathrm{d}z\), is in \(L^2([0,1]^2)\). Since \(f_0,(\text {Id}-L_{0}^{*})^{-1}c\in L^2\), we just need to show that \({\mathbf {1}}_{F_l}(x,y)f_0(y){\hat{g}}(y)\) is in \(L^2([0,1]^2)\). First, we note that
and therefore
We then have
Thus, \({\mathbf {1}}_{F_l}(x,y)f_0(y){\hat{g}}(y)\) is in \(L^2([0,1]^2)\) and therefore \(M\in L^2([0,1]^2)\).
Now, to verify \({\dot{k}}\) satisfies (58), we compute, for \({\tilde{k}}\in V_{\ker }\cap S_{k_0,l}\),
where the last equality follows from \({\tilde{k}}\in V_{\ker }\cap S_{k_0,l}\). To conclude the check that \({\dot{k}}\) satisfies the necessary condition (58), we need to verify that \(\mu \ne 0\). Since \(M\in L^2([0,1]^2)\), the necessary condition (59) yields \(\mu =\pm \frac{1}{2}\Vert M\Vert _{L^2([0,1]^2)}\); thus, to finish the proof that \({\dot{k}}\) satisfies both necessary conditions (58)-(59), we will show that \(\Vert M\Vert _{L^2([0,1]^2)}\ne 0\). From the hypotheses on \(f_0\) and \((\text {Id}-L_{0}^{*})^{-1}c\) we conclude that
Hence, \(\mu =\pm \frac{1}{2}\Vert M\Vert _{L^2([0,1]^2)}\ne 0\). The sign of \(\mu \) is determined by checking the sufficient conditions.
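In finite dimensions the Lagrange-multiplier computation above reduces to maximising a linear functional over the unit sphere, with optimum \({\dot{k}}=M/(2\mu )\) and \(\mu =\frac{1}{2}\Vert M\Vert \). A sketch with generic matrices standing in for the operators (not the paper's R and c):

```python
import numpy as np

rng = np.random.default_rng(1)
R = rng.standard_normal((5, 8))              # stands in for the response operator
c = rng.standard_normal(5)                   # stands in for the observable
M = R.T @ c                                  # finite-dimensional analogue of M
mu = 0.5 * np.linalg.norm(M)                 # Lagrange multiplier mu = ||M|| / 2 > 0
k_opt = M / (2 * mu)                         # the stationary point; unit norm
best = c @ (R @ k_opt)                       # equals ||M|| by Cauchy-Schwarz
```

By Cauchy–Schwarz, \(\langle c,Rk\rangle =\langle M,k\rangle \le \Vert M\Vert \) for every unit vector k, with equality exactly at k_opt, mirroring the uniqueness argument below.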
We can now verify that \({\dot{k}}\in V_{\ker }\cap S_{k_0,l}\). We note from \(M\in L^2([0,1]^2)\) and \(\mu \ne 0\) that \({\dot{k}}\in L^2([0,1]^2)\). By construction supp\(({\dot{k}})\subseteq F_l\). Finally, we have
Sufficient conditions: We want to show that \({\dot{k}}\) in (28) is a solution to the optimisation problem (26)-(27) by checking that it satisfies the second-order sufficient conditions. We first demonstrate that the set of Lagrange multipliers \(\Lambda ({\dot{k}})\) (in Definition 3.8, §3.1 Bonnans and Shapiro 2013) is not empty in our setting; this will enable us to use the second-order sufficient conditions of Lemma 3.65 (Bonnans and Shapiro 2013). Note that in terms of the notation used in Bonnans and Shapiro (2013) versus our notation, \(Q=X=V_{\ker }\cap S_{k_0,l}\), \(x_{0}={\dot{k}}\), \(Y^{*}={{\mathbb {R}}}\), \(G(x_{0})=g({\dot{k}})\), \(K=\{0\}\), \(N_{K}(G(x_{0}))={{\mathbb {R}}}\), \(T_{K}(G(x_{0}))=\{0\}\) and \(N_{Q}(x_{0})=\{0\}\) (since \(Q=X\), see discussion in §3.1 following Definition 3.8). Thus, to show that \(\Lambda ({\dot{k}})\) is not empty, we need to show that \({\dot{k}}\) and \(\mu \) satisfy
where \(\{0\}^{-}:=\{a \in {\mathbb {R}}:a x\le 0\ \forall x\in \{0\}\}={\mathbb {R}}\) (this simplification of conditions (3.16) in Bonnans and Shapiro (2013) follows from the discussion following Definition 3.8 in §3.1 and the fact that \(\{0\}\) is a convex cone). Since the second condition in (61) implies the fourth, and since \(\mu \in {{\mathbb {R}}}\), we only need to check the first two equalities in (61). However, these two conditions are implied by the first-order necessary conditions. Hence, \(\Lambda ({\dot{k}})\) is not empty and thus, to show that \({\dot{k}}\) is a solution to (26)-(27), we need to show that it satisfies the following second-order conditions (see Lemma 3.65): there exist constants \(\nu >0\), \(\eta >0\) and \(\beta >0\) such that
where \(C_{\eta }({\dot{k}}):=\big \{v\in V_{\ker }\cap S_{k_0,l}:|2\langle {\dot{k}},v\rangle _{V_{\ker }\cap S_{k_0,l}}|\le \eta \Vert v\Vert _{V_{\ker }\cap S_{k_0,l}}\text { and }f(v)\le \eta \Vert v\Vert _{V_{\ker }\cap S_{k_0,l}} \big \}\) is the approximate critical cone (see equation (3.131) in §3.3 Bonnans and Shapiro 2013). Since \(D_{{\dot{k}}}{\mathcal {L}}({\dot{k}},\mu ){\tilde{k}}=f({\tilde{k}})+2\mu \langle {\dot{k}},{\tilde{k}}\rangle _{L^{2}([0,1]^{2})}\) and \(\langle {\dot{k}},{\tilde{k}}\rangle _{L^{2}([0,1]^{2})}\) is linear in \({\dot{k}}\), we have that \(D_{{\dot{k}}{\dot{k}}}^{2}{\mathcal {L}}({\dot{k}},\mu )({\tilde{k}},{\tilde{k}})=2\mu \langle {\tilde{k}},{\tilde{k}}\rangle _{L^{2}([0,1]^{2})}\). Thus, we conclude that the second-order condition (62) holds with \(\mu >0\), \(\nu =|\mu |=\frac{1}{2}\Vert M\Vert _{V_{\ker }\cap S_{k_0,l}}\), \(\beta =2\mu \) and \(\eta =\max \big \{2\Vert {\dot{k}}\Vert _{V_{\ker }\cap S_{k_0,l}},\Vert c\Vert _2\Vert f_0\Vert _2\Vert (\text {Id}-L_0)^{-1}\Vert _{V\rightarrow V} \big \}\). Since \({\dot{k}}\) satisfies the necessary conditions (58) and (59) with \(\mu >0\), we conclude that \({\dot{k}}\) is a solution to the optimisation problem (26)-(27).
Uniqueness of the solution: The set \(P_l=V_{\ker }\cap S_{k_0,l}\cap B_1\) is a closed (Lemma 5.1), bounded, strictly convex set, containing \({\dot{k}}=0\). The objective \({\mathcal {J}}({\dot{k}})=\langle c,R({\dot{k}})\rangle \) is continuous (since \({\mathcal {J}}\) is linear and R is continuous (see comment following (14))) and not uniformly vanishing (Lemma 5.2). Therefore by Propositions 4.1 and 4.3, \({\dot{k}}\) is the unique optimum.
\(L^\infty \) boundedness of the solution: Suppose that \(c\in W\) and \(k_0\in L^\infty ([0,1]^2)\). From \(L_0f_0=f_0\) and \(k_0\in L^\infty ([0,1]^2)\), we have by (7) that \(f_0\in L^\infty \). Let \(V_1 := \{f\in L^1:\int f\ \mathrm{d}m = 0\}\). We would like to show that \((\text {Id}-L_0)^{-1}:V_1\rightarrow V_1\) is bounded. To obtain this, we first need the exponential contraction of \(L_0\) on \(V_1\). Since \(L_0\) is integral preserving and compact on \(L^1\), from the argument in the proof of Theorem 2.2 we only need to verify the \(L^1\) version of assumption (A1) on \(V_1\). To verify this, we note that for \(h\in V_1\), we have \(\Vert L_0h\Vert _2\le \Vert L_0 h\Vert _\infty \le \Vert k_0\Vert _{L^\infty ([0,1]^2)}\Vert h\Vert _1\) and therefore, \(L_0h\in V\) since \(L_0\) preserves the integral. Thus, for any \(h\in V_1\), \(\lim _{n\rightarrow \infty }\Vert L_0^nh\Vert _1\le \lim _{n\rightarrow \infty }\Vert L_0^{n-1}(L_0h)\Vert _2=0\) since \(L_0\) satisfies (A1) on V. Hence, the \(L^1\) version of (A1) holds and \(L_0\) has exponential contraction on \(V_1\). We then have
where the last inequality follows from \(\lambda <0\); thus, (\(\text {Id}-L_0)^{-1}:V_1\rightarrow V_1\) is bounded.
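The boundedness argument here is the familiar Neumann-series bound \(\Vert (\text {Id}-L_0)^{-1}\Vert \le \sum _{n\ge 0}\Vert L_0^n\Vert \le C\sum _{n\ge 0}e^{\lambda n}<\infty \) for \(\lambda <0\). A finite-dimensional sketch, with a small contraction standing in for \(L_0\) restricted to \(V_1\):

```python
import numpy as np

A = np.array([[0.2, 0.3],
              [0.1, 0.4]])                   # stands in for L0 on V_1; spectral radius 0.5
inv = np.linalg.inv(np.eye(2) - A)           # (Id - L0)^{-1}
S, An = np.zeros((2, 2)), np.eye(2)
for _ in range(200):                         # Neumann series: sum_{n >= 0} A^n
    S, An = S + An, An @ A
partial = sum(np.linalg.norm(np.linalg.matrix_power(A, k), 2) for k in range(200))
```

The truncated series S agrees with the exact inverse to machine precision, and the norm of the inverse is controlled by the sum of the norms of the powers, which decay exponentially.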
Next we would like to find the subspace where the operator (\(\text {Id}-L_0^*)^{-1}\) is bounded. We will replicate the result of Lemma A.1, however, (\(\text {Id}-L_0)^{-1}\) is now acting on \(L^1\), so we note that for a subspace \({\mathcal {S}}\) of \(L^1\), we have that
where we are using the fact that \((L^1)^*=L^\infty \). Also, \({\mathcal {S}}^{\perp }\) is a closed subspace of \(L^\infty \) (see III-§1.4, Kato 1995).
Now, as in the proof of Lemma A.1, we have \(\ker (\text {Id}-L_0)=\) span\(\{f_0\}\) and ran\((\text {Id}-L_0)=V_1\). We also have ran\(((\text {Id}-L_0)^*)=\) span\(\{f_0\}^{\perp } =\{h\in L^\infty :\int h(x)f_0(x)\mathrm{d}x = 0\}=:W\) and \(\ker ((\text {Id}-L_0)^*)=V_1^{\perp }=\{h\in L^\infty :\int h(x)w(x)\mathrm{d}x = 0\ \forall \ w\in V_1\}\). Next, for \(h\in W\), we have
thus, (\(\text {Id}-L_0^*)(W)\subset W\). We again, as in Lemma A.1, apply the Inverse Mapping Theorem to prove that (\(\text {Id}-L_0^*)^{-1}: W\rightarrow W\) is bounded. From (7), and the triangle inequality, the operator \(\text {Id}-L_0^*: W\rightarrow W\) is bounded. Noting that \(V_1\) is a closed co-dimension 1 subspace of \(L^1\), we have codim\((V_1)=\) dim\((V_1^\perp )\) (see Lemma 1.40 III-§1.8 Kato 1995); hence, dim\((\ker (\text {Id}-L_0^*))= \) dim\((V_1^\perp )=\) codim\((V_1)= 1\) and therefore, 1 is a geometrically simple eigenvalue of \(L_0^*\). Thus, \(\ker (\text {Id}-L_0^*)=\) span\(\{{\mathbf {1}}\}\) because \(L_0^*{\mathbf {1}}={\mathbf {1}}\). Since \(\int f_0\ \mathrm{d}m =1\), \({\mathbf {1}}\not \in \) span\(\{f_0\}^\perp \) and so, by the Fredholm alternative, \(\text {Id}-L_0^*\) is a bijection on W. Hence, by the Inverse Mapping Theorem, (\(\text {Id}-L_0^*)^{-1}\) is bounded on W. Since \(c\in W\), we have \(\Vert (\text {Id}-L_0^*)^{-1}c\Vert _\infty <\infty \).
To conclude the proof, we now show that \( {\hat{g}}(y) :=\frac{1}{m(F_l^y)}\int _{F_l^y}((\text {Id}-L_{0}^{*})^{-1}c)(z)\mathrm{d}z\) is in \(L^\infty \). We compute
Since \((\text {Id}-L_0^*)^{-1}c\in L^{\infty }\), we conclude that \({{\hat{g}}}\in L^\infty \); thus, \({\dot{k}}\in L^\infty ([0,1]^2)\). \(\square \)
Appendix B: Proof of Theorem 5.6
Proof
The optimisation problem is very similar to that considered in Theorem 5.4; thus, we will refer to the proof of that theorem with the following modifications.
Consider the Lagrangian function
\({\mathcal {L}}({\dot{k}},\mu ):=f({\dot{k}})+\mu g({\dot{k}}),\)
where, in this setting, we have \(f({\dot{k}})=\langle {\dot{k}},E\rangle _{L^{2}([0,1]^{2},{{\mathbb {R}}})}\) and \(g({\dot{k}})=\Vert {\dot{k}}\Vert _{L^{2}([0,1]^{2},{{\mathbb {R}}})}^{2}-1\). Thus, for the necessary conditions of the Lagrange multiplier method to be satisfied, we need that
for all \({\tilde{k}}\in V_{\ker }\cap S_{k_0,l}\) and
We claim that
satisfies the necessary condition (65), and lies in \(V_{\ker }\cap S_{k_0,l}\). Before we verify this, we will show that
where \(h(y):=\frac{1}{m(F^y_l)}\int _{F^y_l}E(x,y)\mathrm{d}x\), is in \(L^2([0,1]^2)\). Since \(E\in L^2([0,1]^2)\), we just need to show that \({\mathbf {1}}_{F_l}(x,y)h(y)\) is in \(L^2([0,1]^2)\). We have
Substituting (18) into h, the terms in \(h(y)^2\) are a linear combination of functions of the form \({\tilde{g}}_{i_1}(y){\tilde{g}}_{i_2}(y)f_{i_3}(y)f_{i_4}(y)\), \(i_1,\ldots ,i_4\in \{1,\ldots ,4\}\) where \(f_j = \Re ({\hat{e}}),\Re (e),\Im ({\hat{e}})\) or \(\Im (e)\), \(j=1,\ldots ,4\), respectively, and \({\tilde{g}}_j(y) = \frac{1}{m(F_l^y)}\int _{F_l^y}f_j(x)\mathrm{d}x\), \(j=1,\ldots ,4\). Thus, to show \({\mathbf {1}}_{F_l}(x,y)h(y)\) is in \(L^2([0,1]^2)\) (and therefore \(M\in L^2([0,1]^2)\)), we need to bound
We note that
Thus, we have
Since \(f_j\in L^2\), \(j=1,\ldots ,4\), we conclude that \(M\in L^2([0,1]^2)\).
Now, to verify \({\dot{k}}\) satisfies the first necessary condition, we compute, for \({\tilde{k}}\in V_{\ker }\cap S_{k_0,l}\), the central term in (65)
where the last equality is from \({\tilde{k}}\in V_{\ker }\cap S_{k_0,l}\). To conclude the check that \({\dot{k}}\) satisfies the necessary condition (65), we need to verify that \(\mu \ne 0\). Since \(M\in L^2([0,1]^2)\), the necessary condition (66) yields \(\mu =\pm \frac{1}{2}\Vert M\Vert _{L^{2}([0,1]^{2},{{\mathbb {R}}})}\); thus, to finish the proof that \({\dot{k}}\) satisfies both necessary conditions (65)-(66), we will show that \(\Vert M\Vert _{L^{2}([0,1]^{2},{{\mathbb {R}}})}\ne 0\). From the hypotheses on E we conclude
Hence \(\mu =\pm \frac{1}{2}\Vert M\Vert _{L^{2}([0,1]^{2},{{\mathbb {R}}})}\ne 0\). The sign of \(\mu \) is determined by checking the sufficient conditions.
We can now verify that \({\dot{k}}\in V_{\ker }\cap S_{k_0,l}\). We note from \(M\in L^2([0,1]^2)\) and \(\mu \ne 0\) that \({\dot{k}}\in L^2([0,1]^2)\). By construction, supp\(({\dot{k}})\subseteq F_l\). Finally, we have
For the sufficient conditions, we note that in this setting \(D^2_{{\dot{k}}{\dot{k}}}{\mathcal {L}}({\dot{k}},\mu )({\tilde{k}},{\tilde{k}})\) is the same as in the proof of Theorem 5.4 (since the objectives in both optimisation problems are linear). Hence, the second-order sufficient conditions are satisfied with \(\mu >0\). Thus, with \(2\mu = \Vert M\Vert _{L^2([0,1]^2,{{\mathbb {R}}})}\), (31) satisfies the necessary and sufficient conditions. Next, we note that the set \(P_l=V_{\ker }\cap S_{k_0,l}\cap B_1\) is a closed (Lemma 5.1), bounded, strictly convex set containing \({\dot{k}}=0\). The objective \({\mathcal {J}}({\dot{k}})=\langle {\dot{k}},E\rangle _{L^{2}([0,1]^{2},{{\mathbb {R}}})}\) is continuous and not uniformly vanishing (Lemma 5.2). Therefore, by Propositions 4.1 and 4.3, (31) is the unique solution to the optimisation problem (29)-(30).
We finally show that \(E\in L^\infty ([0,1]^2,{{\mathbb {R}}})\) by supposing \(k_0\in L^\infty ([0,1]^2,{{\mathbb {R}}})\). Recall that
Since \(L_0e = \lambda _{0}e\) and \(L_0^*{\hat{e}} = \lambda _{0}{\hat{e}}\), we have from inequality (7) that \(e,{\hat{e}}\in L^\infty ([0,1],{{\mathbb {C}}})\) since \(k_0\in L^\infty ([0,1]^2,{{\mathbb {R}}})\). Hence, we have that \(\Re (e),\Re ({\hat{e}}),\Im (e),\Im ({\hat{e}})\in L^\infty ([0,1],{{\mathbb {R}}})\) and thus \(E\in L^\infty ([0,1]^2,{{\mathbb {R}}})\). \(\square \)
Appendix C: Upper Bound for the Norm of the Reflection Operator
Lemma C.1
Let \(P_\pi \) be as in (34) and assume that the support of \(f\in L^2({{\mathbb {R}}})\) is contained in N intervals of lengths \(a_j, j=1,\ldots ,N\). Then, \(\Vert P_\pi f\Vert _{L^2([0,1])}\le \left( \sum _{j=1}^N\lceil a_j+1\rceil \right) \Vert f\Vert _{L^2({{\mathbb {R}}})}\), where \(\lceil x\rceil \) denotes the smallest integer greater than or equal to x.
Proof
Using translation invariance of Lebesgue measure, and the fact that for each fixed x there are at most \(\sum _{j=1}^N\lceil a_j+1\rceil \) nonzero evaluations of f in the infinite sum below,
\(\square \)
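Lemma C.1 can be checked numerically. Assuming the reflection operator takes the standard folding form \(P_\pi f(x)=\sum _{n\in {\mathbb {Z}}}\big (f(x+2n)+f(-x+2n)\big )\) (the definition (34) is not reproduced here, so this form is an assumption), for an f supported on a single interval of length a the bound reads \(\Vert P_\pi f\Vert _{L^2([0,1])}\le \lceil a+1\rceil \Vert f\Vert _{L^2({{\mathbb {R}}})}\):

```python
import numpy as np

def fold(f, xs, n_terms=10):
    # Assumed form of the reflection operator P_pi: fold f from R into [0,1]
    # by summing over the reflections x -> x + 2n and x -> -x + 2n.
    out = np.zeros_like(xs)
    for n in range(-n_terms, n_terms + 1):
        out += f(xs + 2 * n) + f(-xs + 2 * n)
    return out

f = lambda t: np.where((t > 0.8) & (t < 1.3), np.cos(7.0 * t), 0.0)  # support length a = 0.5
xs = np.linspace(0.0, 1.0, 4001)
dx = xs[1] - xs[0]
lhs = np.sqrt(np.sum(fold(f, xs) ** 2) * dx)       # ||P_pi f||_{L2([0,1])}
ts = np.linspace(0.5, 1.6, 20001)
dt = ts[1] - ts[0]
norm_f = np.sqrt(np.sum(f(ts) ** 2) * dt)          # ||f||_{L2(R)} (support in [0.8, 1.3])
bound = np.ceil(0.5 + 1) * norm_f                  # N = 1 interval, ceil(a + 1) = 2
```

For this f only two reflected copies land in [0,1], matching the counting argument in the proof: at each x at most \(\lceil a+1\rceil =2\) terms of the sum are nonzero.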
Appendix D: Proof of Theorem 7.4
Proof
The proof will follow the structure of the proof of Theorem 5.4. To this end, we consider the Lagrangian function
\({\mathcal {L}}({\dot{T}},\mu ):=f({\dot{T}})+\mu g({\dot{T}}),\)
where \(f({\dot{T}}) := -\big \langle c, {\widehat{R}}({\dot{T}})\big \rangle _{L^2([0,1],{{\mathbb {R}}})}, \) \(g({\dot{T}}) := \Vert {\dot{T}}\Vert ^2_2-1\) and \({\dot{T}}\in S_{T_0,\ell }\).
Necessary conditions: We want to find \({\dot{T}}\) and \(\mu \) that satisfy the first-order necessary conditions:
where \(D_{{\dot{T}}} {\mathcal {L}}({\dot{T}},\mu )\in {\mathcal {B}}(L^2,{{\mathbb {R}}})\) is the Fréchet derivative with respect to the variable \({\dot{T}}\). Since f is linear, we have \((D_{{\dot{T}}}f) {\tilde{T}} = f({\tilde{T}})\). Also, we have that \((D_{{\dot{T}}}g) {\tilde{T}} = 2\langle {\dot{T}},{\tilde{T}}\rangle _{L^2([0,1],{{\mathbb {R}}})} \) (following the computation in the proof of Theorem 5.4). Thus, for the necessary conditions of the Lagrange multiplier method to be satisfied, we need that
for all \({\tilde{T}}\in S_{T_0,\ell }\) and
Following the proof of Theorem 5.4, we will solve for \( {\dot{T}}\) by rewriting \(f({\tilde{T}})+2\mu \langle {\dot{T}},{\tilde{T}} \rangle _{L^2([0,1],{{\mathbb {R}}})}\) as an inner product on \(L^2\). To this end, we have that
We note that since \(c\in \) span\(\{f_0\}^\perp \), we have from Lemma A.1 that \((\text {Id}-L_0^*)^{-1}c\in L^2\) and the above expression is well defined. Now, from (70), we have that \(f({\tilde{T}})+2\mu \langle {\dot{T}}, {\tilde{T}}\rangle _{L^2([0,1],{{\mathbb {R}}})} = \langle f_0\ {\mathcal {G}}((\text {Id}-L_0^*)^{-1}c)+2\mu {\dot{T}}, {\tilde{T}}\rangle _{L^2([0,1],{{\mathbb {R}}})}\). From this we can conclude that finding \({\dot{T}}\) and \(\mu \) that satisfy (68) and (69) reduces to finding \({\dot{T}}\in S_{T_0,\ell }\) and \(\mu \in {{\mathbb {R}}}\) that satisfy \(\langle f_0\ {\mathcal {G}}((\text {Id}-L_0^*)^{-1}c)+2\mu {\dot{T}}, {\tilde{T}}\rangle _{L^2([0,1],{{\mathbb {R}}})}=0\) for all \({\tilde{T}}\in S_{T_0,\ell }\) and (69). Using the non-degeneracy of the inner product, we find that
where
To conclude that the above \({\dot{T}}\) satisfies the necessary condition (68), we need to check that \(\mu \ne 0\). Since \(M\in L^\infty \) (see the Boundedness of the solution paragraph below), the necessary condition (69) yields \(\mu = \pm \frac{1}{2}\Vert M\Vert _2\); thus, to finish the proof that \({\dot{T}}\) satisfies both necessary conditions (68)-(69), we will show that \(\Vert M\Vert _2\ne 0\). From the hypotheses on \(f_0\) and \((\text {Id}-L_0^*)^{-1}c\), and recalling that \({\mathcal {P}}(x,y)=P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) (x)\), we conclude that
Hence \(\mu = \pm \frac{1}{2}\Vert M\Vert _2\ne 0\); the sign of \(\mu \) is determined by checking the sufficient conditions. We thus have verified that \({\dot{T}}\in S_{T_0,\ell }\) because \({\dot{T}}\in L^2\) and the term \({\mathbf {1}}_{{\widetilde{F}}_\ell }\) in (71) guarantees supp\(({\dot{T}})\subseteq {\widetilde{F}}_\ell \).
Sufficient conditions: As in the proof of Theorem 5.4, we will show that \({\dot{T}}\) in (49) is the solution to the optimisation problem (46)-(47) by checking that it satisfies the second-order sufficient conditions. We first note that in this setting we have \(Q=X=S_{T_0,\ell }\), \(x_0 = {\dot{T}}\), \(Y^* = {{\mathbb {R}}}\), \(G(x_0)=g( {\dot{T}})\), \(K = \{0\} \), \(N_K(G(x_0)) = {{\mathbb {R}}}\), \(T_{K}(G(x_0))=\{0\}\) and \(N_Q(x_0) = \{0\}\). Thus, to show that \(\Lambda ({\dot{T}})\) is not empty, we need to show that \( {\dot{T}}\) and \(\mu \) satisfy
where \(\{0\}^-:=\{\alpha \in {\mathbb {R}}: \alpha x\le 0\ \forall x\in \{0\}\}= {\mathbb {R}}\). Following the argument in the proof of Theorem 5.4, it is easily verifiable that \(\Lambda ({\dot{T}})\) is not empty. Thus, to show that \({\dot{T}}\) is a solution to (46)-(47), we need to show that it satisfies the following second-order conditions: there exist constants \(\nu >0\), \(\eta >0\) and \(\beta >0\) such that
where \(C_\eta ({\dot{T}}):= \big \{v\in S_{T_0,\ell }: |2\langle {\dot{T}},v\rangle _{S_{T_0,\ell }}|\le \eta \Vert v\Vert _{S_{T_0,\ell }}\text { and } f(v)\le \eta \Vert v\Vert _{S_{T_0,\ell }} \big \}\) is the approximate critical cone. Since \(D_{{\dot{T}}}{\mathcal {L}}({\dot{T}},\mu )\tilde{ T} = f({\tilde{T}})+2\mu \langle {\dot{T}},{\tilde{T}}\rangle _{L^2([0,1],{{\mathbb {R}}})} \) and \( \langle {\dot{T}},{\tilde{T}}\rangle _{L^2([0,1],{{\mathbb {R}}})}\) is linear in \({\dot{T}}\), we have that \(D^2_{{\dot{T}}{\dot{T}}}{\mathcal {L}}({\dot{T}},\mu )({\tilde{T}},{\tilde{T}}) = 2\mu \langle {\tilde{T}},{\tilde{T}}\rangle _{L^2([0,1],{{\mathbb {R}}})}\). Thus, we conclude that the second-order condition (73) holds with \(\mu >0\), \( \nu =|\mu | = \frac{1}{2}\Vert M\Vert _{S_{T_0,\ell }} \), \(\beta = 2\mu \) and \(\eta = \max \big \{2\Vert {\dot{T}}\Vert _{S_{T_0,\ell }},\Vert M\Vert _{S_{T_0,\ell }}\big \}\). Since \({\dot{T}}\) satisfies the necessary conditions (68) and (69), with \(\mu >0\), \({\dot{T}}\) is a solution to the optimisation problem (46)-(47). Using Lemma 7.2 and Proposition 7.3, we conclude that this solution is unique.
Boundedness of the solution: We have that \(\left( P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) \right) (x)\in L^\infty ([0,1]^2)\) (see proof of Lemma 6.5). From inequality (7), with the kernel \(\left( P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) \right) (x)\), we have that \({\mathcal {G}}h\in L^\infty \) for any \(h\in L^2\). Since \(f_0\in L^\infty \), we have that \(f_0\ {\mathcal {G}}((\text {Id}-L_0^*)^{-1}c)\in L^\infty \). Thus, \(M={\mathbf {1}}_{{\widetilde{F}}_\ell } f_0\ {\mathcal {G}}((\text {Id}-L_0^*)^{-1}c)\in L^\infty \) and therefore \({\dot{T}}\in L^\infty \). \(\square \)
Appendix E: Proof of Theorem 7.7
Proof
We use arguments similar to those in the proofs of Theorems 7.4 and 5.6. Let \({\widehat{E}}\) be as in (51). For the necessary conditions, we will need that \(\langle {\widehat{E}}+2\mu {\dot{T}}, {\tilde{T}}\rangle _{L^2([0,1],{{\mathbb {R}}})}=0\) (74) for all \({\tilde{T}}\in S_{T_0,\ell }\) and \(\Vert {\dot{T}}\Vert _{L^2([0,1],{{\mathbb {R}}})}=1\) (75).
Thus, from (74) and the nondegeneracy of the inner product we have that \({\dot{T}} = -{\mathbf {1}}_{{\widetilde{F}}_\ell }\frac{{\widehat{E}}}{2\mu }\). To conclude that \({\dot{T}}\) satisfies the necessary condition (74), we need to check that \(\mu \ne 0\). Since \({\widehat{E}}\in L^2\) (as it is essentially bounded, see Proposition 7.5), the necessary condition (75) yields \(\mu = \pm \frac{1}{2}\Vert {\widehat{E}}\Vert _2\). Thus, to finish the proof that \({\dot{T}}\) satisfies both necessary conditions (74)-(75), it remains to show that \(\Vert {\widehat{E}}\Vert _2\ne 0\). From the hypotheses on E, and recalling that \({\mathcal {P}}(x,y)=P_\pi \left( \tau _{-T_0(y)}\frac{\mathrm{d}\rho }{\mathrm{d}x}\right) (x)\), we conclude that \(\Vert {\widehat{E}}\Vert _2\ne 0\). Hence \(\mu = \pm \frac{1}{2}\Vert {\widehat{E}}\Vert _2\ne 0\) and \({\dot{T}} = \mp {\mathbf {1}}_{{\widetilde{F}}_\ell }\frac{{\widehat{E}}}{\Vert {\widehat{E}}\Vert _2}\); the sign of \(\mu \) is determined by checking the sufficient conditions. Clearly \({\dot{T}}\in L^2\) and has support contained in \({\widetilde{F}}_\ell \), so \({\dot{T}}\in S_{T_0,\ell }\). For the sufficient conditions, as in the proof of Theorem 7.4, since the objective is linear, we require that \(\mu >0\). Using Lemma 7.2 and Proposition 7.6 we conclude that (55) is the unique solution. The essential boundedness of \({\dot{T}}\) follows from the essential boundedness of \({\widehat{E}}\) (see Proposition 7.5). \(\square \)
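The optimality of the normalised perturbation \({\dot{T}} = -{\mathbf {1}}_{{\widetilde{F}}_\ell }{\widehat{E}}/\Vert {\widehat{E}}\Vert _2\) for the linear objective is an instance of Cauchy-Schwarz, and can be illustrated numerically. In the sketch below, the grid, support set, and the random stand-in `E_hat` for \({\widehat{E}}\) are hypothetical placeholders.

```python
import numpy as np

# Sketch: among unit-norm perturbations supported in F~_ell, the normalised
# T_dot = -1_{F~_ell} E^ / ||E^||_2 minimises the linear objective <E^, T>,
# by Cauchy-Schwarz.  E_hat is a random illustrative stand-in.
rng = np.random.default_rng(1)
n = 200
on_F = np.zeros(n, dtype=bool)           # indicator of the support set F~_ell
on_F[30:170] = True

E_hat = np.where(on_F, rng.standard_normal(n), 0.0)

T_dot = -E_hat / np.linalg.norm(E_hat)   # the mu > 0 branch of the optimiser
obj_opt = float(np.dot(E_hat, T_dot))    # equals -||E_hat||_2

# No unit-norm competitor supported in F~_ell attains a smaller objective
for _ in range(100):
    v = np.where(on_F, rng.standard_normal(n), 0.0)
    v /= np.linalg.norm(v)
    assert np.dot(E_hat, v) >= obj_opt - 1e-12
```

Equality in the Cauchy-Schwarz bound holds only for \(v = -{\widehat{E}}/\Vert {\widehat{E}}\Vert _2\) itself, which mirrors the uniqueness conclusion of the theorem.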
Antown, F., Froyland, G. & Galatolo, S. Optimal Linear Response for Markov Hilbert–Schmidt Integral Operators and Stochastic Dynamical Systems. J Nonlinear Sci 32, 79 (2022). https://doi.org/10.1007/s00332-022-09839-0