Emergence of Freidlin–Wentzell’s transmission conditions as a result of a singular perturbation of a semigroup

We present a semigroup-theoretic approach to an example that was one of the main motivations of the famous Freidlin–Wentzell averaging principle. In this example, as a result of a singular perturbation of the semigroups involved, the generator of the limit semigroup is characterized by a transmission condition emerging at a point where no transmission conditions were needed before.


Introduction
The theory of one-parameter semigroups of operators, of beauty and importance of its own, is deeply rooted in the theories of partial differential equations and stochastic processes [13-15,23,24,32]. Although there are good textbooks on semigroups which successfully avoid the notion of a stochastic process (see e.g. [14]), it is quite impossible to imagine the early stages of the theory without the impetus it gained from giants like Feller [16-18]. In fact, this is not surprising at all, since the very core of both theories lies in understanding dynamics, be it stochastic or deterministic. For example, it was realized quite early that the main theorems of probability (e.g. the Central Limit Theorem) may be proved in an elegant way using approximation theorems for semigroups [4,15,22]. On the other hand, it transpires that, seen from a proper perspective, exponential formulae of semigroup theory become special instances of limit theorems of probability theory [11].
Communicated by Jerome A. Goldstein. Adam Bobrowski, a.bobrowski@pollub.pl, Lublin University of Technology, Nadbystrzycka 38A, 20-618 Lublin, Poland.
The theory of semigroups of operators and that of stochastic processes, continue to influence each other with mutual benefit. Modern books on Markov processes (like the old ones, see e.g. [13,18]) treat semigroups, or at least Feller semigroups, extensively and take advantage of the core semigroup-theoretical results [15,25]; in particular, since the publication of the treatise [15] it has become evident that convergence of stochastic processes may be efficiently treated with the help of Trotter-Kato-type theorems. On the other hand, phenomena encountered in stochastic processes do not cease to provide new and interesting challenges for semigroup-theorists.
This article is devoted to a semigroup-theoretical approach to a motivating example of the famous averaging principle of Freidlin and Wentzell [19,21] (see also [20]); although the principle is quite well known in probabilistic circles, its semigroup-theoretical treatment seems to be still missing. Such a treatment is needed for a clear picture of singular perturbations involving perturbation of boundary and transmission conditions. To explain: the approximation procedure considered here is a close relative of those of [2,5,9,10]. In [5,9,10], as a result of stochastic averaging, the limit semigroup is generated by a convex combination of the generators of the original semigroups. In [2] it is the domain of the generator of the limit semigroup that is perturbed: this domain is the kernel of a convex combination of operators describing the domains of the generators of the original semigroups. In other words, the boundary or transmission conditions are changed in the limit, as a result of averaging. In this paper, the averaging procedure leads to another, related, phenomenon: it forces the emergence of transmission conditions at a point where they were not needed before.

Semigroup-theoretical tools
The main semigroup-theoretic devices we need are (a) the semigroups generated by operators related to quadratic forms in Hilbert spaces [12,26,27] (a theory that has been intensively developing in recent years, see [1,3,31] and the references cited there), and (b) Kurtz's singular perturbation theorem [15,28,29].
To recall, suppose a sesquilinear form a in a Hilbert space X is sectorial and closed, and that its domain V = V_a is dense in X. Then, there exists a linear, closed operator A with domain D(A) ⊂ V_a satisfying the following conditions: a(x, y) = −(Ax, y) for all x ∈ D(A) and y ∈ V_a; and, conversely, if x ∈ V_a and there is a z ∈ X such that a(x, y) = −(z, y) for all y ∈ V_a, then x ∈ D(A) and Ax = z.
Using the Lax–Milgram lemma it may be shown that the range of λ − A is the whole of X provided λ > γ, where γ is the vertex of the sector associated with the form a. Also, A is uniquely determined by the conditions given above. Moreover, A generates a holomorphic semigroup in X.
Turning to Kurtz's singular perturbation theorem [15,28,29], let (ε_n)_{n≥1} be a sequence of positive numbers converging to 0. Suppose A_n, n ≥ 1, are generators of equi-bounded semigroups {e^{tA_n}, t ≥ 0} in a Banach space X, and Q generates a strongly continuous semigroup (e^{tQ})_{t≥0} such that the strong limit
P := lim_{t→∞} e^{tQ}
exists. Then P is a bounded idempotent operator.
Theorem 2.1 (Kurtz's theorem) Let A be an operator in X, let D be a subset of its domain, and assume that (a) for x ∈ D, there exist x_n ∈ D(A_n) such that lim_{n→∞} x_n = x and lim_{n→∞} A_n x_n = Ax, (b) for y in a core D′ of Q, there exist y_n ∈ D(A_n) such that lim_{n→∞} y_n = y and lim_{n→∞} ε_n A_n y_n = Qy, and (c) the operator PA with domain D ∩ X′, where X′ is the range of P, is closable and its closure \overline{PA} generates a strongly continuous semigroup in X′. Then, for x ∈ X and t > 0,
lim_{n→∞} e^{tA_n} x = e^{t\overline{PA}} P x.
For x in the range of P the same is true for t = 0 as well, and the limit is almost uniform in t ∈ [0, ∞); for other x the limit is almost uniform in t ∈ (0, ∞).
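The mechanism of Kurtz's theorem can be illustrated in a finite-dimensional toy model, where everything is computable. The sketch below is our own illustration, not taken from [15,28,29]: A is an arbitrarily chosen sub-Markov generator matrix, Q is a two-state Markov-chain intensity matrix, and we check numerically that e^{t(A + Q/ε)} approaches e^{t q̄}P as ε → 0, where P = lim_{t→∞} e^{tQ} and q̄ is the scalar through which PA acts on the (one-dimensional) range of P.

```python
import numpy as np

def expm(M, t):
    """e^{tM} via eigendecomposition (adequate for generic small matrices)."""
    w, V = np.linalg.eig(M)
    return (V @ np.diag(np.exp(t * w)) @ np.linalg.inv(V)).real

alpha, beta = 2.0, 3.0
# Q: intensity matrix of a two-state Markov chain (rows sum to 0);
# its stationary distribution is pi = (alpha, beta)/(alpha + beta)
Q = np.array([[-beta, beta], [alpha, -alpha]])
pi = np.array([alpha, beta]) / (alpha + beta)

# P = lim_{t->infty} e^{tQ}: every row equals pi
P = np.vstack([pi, pi])
assert np.allclose(expm(Q, 50.0), P, atol=1e-8)

# A: an arbitrarily chosen sub-Markov generator (the 'slow' part)
A = np.array([[-1.0, 0.5], [0.2, -0.8]])

# On the range of P (constant vectors), PA acts as the scalar
# qbar = pi @ A @ 1, so the limit semigroup is e^{t*qbar} P
qbar = pi @ A @ np.ones(2)

t, errs = 1.0, []
for eps in (1e-1, 1e-2, 1e-3):
    A_eps = A + Q / eps  # singularly perturbed generator
    errs.append(np.abs(expm(A_eps, t) - np.exp(t * qbar) * P).max())
print(errs)  # errors shrink as eps -> 0
```

The projection P has one-dimensional range here, so the limit dynamics is governed by a single averaged rate, in full analogy with the convex combinations of generators appearing in [5,9,10].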

The example
Let us present the details of the example this paper is devoted to. To repeat: this example was one of the main motivations for the averaging principle of Freidlin and Wentzell [19,21] (a semigroup-theoretical treatment of another motivation for this principle is discussed in the companion article [8]). Imagine N diffusions (with different diffusion and drift coefficients) on N copies of an interval, with Neumann boundary conditions at the ends, see Fig. 1. Suppose also that, as e.g. in [5,9,10], these diffusions are coupled by Markov chains: while at the ith copy, the process behaves according to the rules governed by the operator A_i, but after a random time depending on its position, it may jump to another copy of the interval, to behave according to the rules described by the operator defined there. In distinction to the situation of [5,9,10] we assume, however, that on the left part of the interval no communication is possible: the intensities of jumps are zero there. Freidlin and Wentzell's result [19, Theorem 5.1] says that, as the intensities of jumps (in the right part of the interval) tend to infinity, the processes involved converge weakly to a diffusion on the graph formed by identifying corresponding points of all the right parts of the intervals, see Fig. 2.
Fig. 2 The limit process on a graph is generated by an averaged operator A (a convex combination of the involved operators A_i) with Neumann boundary conditions at the graph's ends, and balance, transmission conditions at the point x* where the segments meet, i.e. at the new vertex of the graph
The generator of the limit process is a convex combination of the generators of the involved diffusions, a phenomenon thoroughly studied in [5,9,10]. However, a new phenomenon is observed here as well: at the junction x* where the intervals meet, transmission conditions need to be introduced. They are of the form
∑_{i=1}^N π_i(x*) a_i(x*) f′_{i,−}(x*) = ā(x*) f′_{+}(x*), where ā = ∑_{i=1}^N π_i a_i,   (3.1)
where π_i(x*), i = 1, ..., N, are the probabilities of the equilibrium state of the Markov chain at x*, f′_{i,−}(x*) is the left-hand derivative of f at x* calculated on the ith interval, and f′_{+}(x*) is the right-hand derivative of f at x* calculated on the edge formed by amalgamating the right parts of the original intervals.

Analysis in L^2
In what follows we drop secondary features of the example to focus on the reason for the emergence of transmission condition (3.1) and its unique form. More specifically, we consider the case of two intervals (i.e. we take N = 2), and assume that the diffusion coefficients are constant throughout these intervals and that there is no drift at all: a_i(x) = a_i > 0 and b_i(x) = 0 for x ∈ [0, 1], where a_1, a_2 > 0 are constants. (The effect of convex combination has been studied in [5,9] for processes much more general than diffusion processes, so we would gain no generality by introducing variable coefficients here.) Moreover, we assume that for some x* ∈ (0, 1), the Kolmogorov matrix does not depend on x ∈ [x*, 1] and equals
( −β   β
   α  −α ),   (4.1)
where α, β > 0 are given constants (so that the stationary distribution is (α/(α+β), β/(α+β))); for x ∈ [0, x*), the Kolmogorov matrix is 0.
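A quick numerical sanity check of the stationary distribution is possible once the intensity matrix is written explicitly; here we take, as suggested by the stated stationary distribution, rates β for jumps from state 1 to state 2 and α for jumps back (this explicit form is our assumption):

```python
import numpy as np

alpha, beta = 1.3, 2.7   # any positive intensities
# Kolmogorov (intensity) matrix on [x*, 1]: rows sum to zero
K = np.array([[-beta, beta],
              [alpha, -alpha]])

# The stationary distribution solves pi @ K = 0, pi >= 0, sum(pi) = 1
pi = np.array([alpha, beta]) / (alpha + beta)
print(pi @ K)            # ~ (0, 0)
assert np.allclose(pi @ K, 0.0)
assert np.isclose(pi.sum(), 1.0)
```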
Since the operator related to such a choice of Kolmogorov matrices (see below, Eq. (4.2)) does not leave the space of continuous functions invariant (the original Freidlin–Wentzell theorem [19] concerns weak convergence, with test functions being continuous), we will work in the Hilbert space H of pairs f = (f_1, f_2) of square integrable functions on the unit interval, equipped with the scalar product
(f, g) = α ∫_0^1 f_1 ḡ_1 + β ∫_0^1 f_2 ḡ_2.
For κ > 0, we define a sesquilinear form
a_κ = a + κq, where a(f, g) = α ∫_0^1 a_1 f_1′ ḡ_1′ + β ∫_0^1 a_2 f_2′ ḡ_2′ and q(f, g) = αβ ∫_{x*}^1 (f_1 − f_2)(ḡ_1 − ḡ_2).
Here, the domain V of the form a is the space of pairs with both coordinates in the Sobolev space W^{1,2}(0,1). It is clear that both a and q are symmetric and non-negative. A direct calculation shows that the operator related to a (see Sect. 2) is
A(f_1, f_2) = (a_1 f_1″, a_2 f_2″), with domain composed of pairs with both coordinates in W^{2,2}(0,1) satisfying the Neumann conditions f_i′(0) = f_i′(1) = 0.
Moreover, the operator related to q is given by
Q(f_1, f_2) = (β(f_2 − f_1) 1_{[x*,1]}, α(f_1 − f_2) 1_{[x*,1]}),   (4.2)
where 1_{[x*,1]} is the indicator function of the interval [x*, 1]. It follows that both A and Q generate semigroups of self-adjoint contraction operators, and the same is true for the operators A_κ = A + κQ related to the forms a_κ.
The problem is that of finding the limit lim_{κ→∞} e^{tA_κ}.
To this end, we introduce H_0 as the subspace of H composed of pairs (f_1, f_2) such that (f_1 − f_2) 1_{[x*,1]} = 0; such pairs may be identified with square integrable functions on a Y-shaped graph obtained by removing the middle segment in the left-hand part of Fig. 2.
On V_0 := V ∩ H_0 we consider the form
b(f, g) = α ∫_0^{x*} a_1 f_1′ ḡ_1′ + β ∫_0^{x*} a_2 f_2′ ḡ_2′ + (α a_1 + β a_2) ∫_{x*}^1 f_1′ ḡ_1′;
by definition of H_0, f_1 and g_1 in the last integral may be replaced by f_2 and g_2, respectively, without altering b. Again, b is non-negative and symmetric. Integration by parts shows that the related operator B in H_0 is given by
B(f_1, f_2) = (a_1 f_1″, a_2 f_2″),
on the domain composed of f ∈ H_0 such that f_1 and f_2 belong to W^{2,2} on [0, x*] and on [x*, 1] separately (one-sided derivatives at x* may differ), satisfy the Neumann conditions f_i′(0) = f_i′(1) = 0, and the transmission condition
(α a_1 + β a_2) f′_{1,+}(x*) = α a_1 f′_{1,−}(x*) + β a_2 f′_{2,−}(x*),   (4.3)
where + and − denote the right-sided and left-sided derivatives, respectively.
(Again, f 1,+ (x * ) on the left-hand side may be replaced by f 2,+ (x * ).) Certainly, (4.3) is a counterpart of (3.1): these conditions inform of a flux balance; we stress that along with this transmission condition, the continuity condition at x * is tacitly assumed (as implied by (A) above).
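For the reader's convenience, here is a sketch of the integration-by-parts computation behind a flux-balance condition of the form (4.3); it is our reconstruction of the standard argument, under a weighted scalar product on H with weights α and β (an assumption on our part):

```latex
% For f in D(B) (Neumann conditions at 0 and 1) and g in V_0:
\begin{aligned}
-(Bf,g) &= -\alpha\int_0^1 a_1 f_1''\,\overline{g_1}
           -\beta \int_0^1 a_2 f_2''\,\overline{g_2}\\
        &= b(f,g)
           +\Bigl[\alpha a_1\bigl(f_{1,+}'(x^*)-f_{1,-}'(x^*)\bigr)
                 +\beta  a_2\bigl(f_{2,+}'(x^*)-f_{2,-}'(x^*)\bigr)\Bigr]
            \overline{g_1}(x^*),
\end{aligned}
```

since g_1(x*) = g_2(x*) for g ∈ V_0. The bracket must vanish if b(f, g) = −(Bf, g) is to hold for all g ∈ V_0; because f_1 = f_2 on [x*, 1] forces f′_{1,+}(x*) = f′_{2,+}(x*), vanishing of the bracket is precisely the flux balance (4.3).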

Theorem 4.1 We have
lim_{κ→∞} e^{tA_κ} = e^{tB} P
strongly and almost uniformly in t > 0, where the projection P ∈ L(H) is given by
Pf = f on [0, x*) and Pf = (α′f_1 + β′f_2, α′f_1 + β′f_2) on [x*, 1], with α′ = α/(α+β), β′ = β/(α+β);
P is the orthogonal projection of H onto H_0. To prove this result, we need Simon's theorem [33], saying, roughly, that convergence of positive symmetric forms implies convergence of the resolvents of the related operators. A version of this theorem, due to Kato, presented in the classic treatise [26], is slightly weaker, as it assumes that the limit 'upper bound' form is densely defined (which is not the case in the example we study). We note that Simon's theorem has recently been generalized by Batty and ter Elst to the case of series of sectorial forms [3].
Theorem 4.2 (Simon) Let (c_n)_{n≥1} be a nondecreasing sequence of closed, positive, symmetric forms in H, let D(c) be the set of those x belonging to all D(c_n) for which sup_{n≥1} c_n(x, x) < ∞, and let c(x, y) = lim_{n→∞} c_n(x, y) for x, y ∈ D(c). Then, c is closed, positive and symmetric. Moreover, denoting by C the closed operator related to c, defined in H_0, the closure of D(c), and by C_n the closed operators related to c_n, defined in H, we have
lim_{n→∞} (λ − C_n)^{−1} = (λ − C)^{−1} P, λ > 0,
where P is the orthogonal projection of H onto H_0.
Proof of Theorem 4.1 Since q ≥ 0, the forms a_κ increase with κ, so that we are in the set-up of Simon's theorem. If sup_{κ>0} a_κ(f, f) < ∞, then q(f, f) = 0, and it is clear that f ∈ H_0. Therefore, the limit form coincides with a restricted to V_0, and this equals b. By Simon's theorem it follows that
lim_{κ→∞} (λ − A_κ)^{−1} = (λ − B)^{−1} P, λ > 0.
It follows that lim_{t→∞} e^{tQ} = P. Finally, for f ∈ D(B) we have PAf = Bf. In particular, PA is closed, proving assumption (c) in Kurtz's theorem and completing the proof.

Analysis in L^1
The approach of the previous section is quite elegant: the quadratic forms contain all the information needed for the limit theorem. The arguable elegance, however, comes perhaps at the cost of blurring the mechanism of the emergence of transmission conditions in the limit. To explain: the information about these conditions is compressed, or, so to say, 'zipped', in the quadratic form b corresponding to the operator B. While it is seen beyond a shadow of a doubt that the limit form cannot be anything other than b, from the perspective of forms it is still somewhat difficult to grasp the way the transmission conditions come into existence. In other words, with forms the picture is quite clear, but some part of the mystery remains, unless one masters the connection between the operator and the form. Therefore, in this section, we present another approach, where the calculations are much more explicit, if a bit complex. As we shall see, convergence of the semigroups involved may be deduced from convergence of solutions of a linear system of equations with four unknowns (see (5.9)).
To this end, we work with semigroups of Markov operators (contraction operators that preserve the positive cone, see e.g. [30]) in a space of absolutely integrable functions. These semigroups govern the evolution of the densities of the Markov processes involved (and are perhaps a bit more natural than those of the previous section). More specifically, we work with the space X = L^1(R) × L^1(R), identified with the space of integrable functions f : R × {1, 2} → R. In other words, each pair (f_1, f_2) ∈ X is identified with such a function, defined by f(x, i) = f_i(x), i = 1, 2, x ∈ R. The norm in X is given by
‖f‖ = ∫_R |f_1(x)| dx + ∫_R |f_2(x)| dx.
Given two diffusion coefficients a_1, a_2 > 0 we define
A(f_1, f_2) = (a_1 f_1″, a_2 f_2″)
for f_1, f_2 ∈ W^{2,1}(R). Then A generates a semigroup of Markov operators in X; the reader has probably noticed that we have got rid of the reflecting (Neumann) boundary conditions (which have no bearing on the phenomenon under study) and have allowed the Brownian particles to diffuse freely on two copies of the real axis. Next, for given intensities α, β > 0 we define bounded linear operators in X by
Q(f_1, f_2) = (−β f_1 + α f_2, β f_1 − α f_2) 1_{R_+}
and
P = (α+β)^{−1} Q + I_X,   (5.1)
so that Q = (α+β)(P − I_X); explicitly, P(f_1, f_2) coincides with (f_1, f_2) on R_− and with (α′(f_1 + f_2), β′(f_1 + f_2)) on R_+, where α′ = α/(α+β) and β′ = β/(α+β). It is easy to see that P is a Markov operator; it follows that Q generates a semigroup (e^{tQ})_{t≥0} of such operators. This semigroup describes the process in which the states (x, i) with x < 0 are absorbing, while (x, 1) and (x, 2) communicate as states of a Markov chain with intensity matrix (4.1). (In particular, it is the point 0 that now plays the role of x* of the previous section.) Therefore, the Phillips perturbation theorem, combined with the Trotter product formula, implies that for each κ > 0 the operator
A_κ = A + κQ
generates a semigroup of Markov operators in X. (This semigroup is in a sense dual to that generated by A_κ of the previous section.) Since we want to find the limit lim_{κ→∞} e^{tA_κ}, we turn to studying the resolvent equation λf − A_κ f = g, for λ > 0 and g ∈ X. As we shall see, the solution f ∈ D(A) (which exists and is unique, A_κ being the generator of a contraction semigroup) may be found in a quite explicit way.
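The claim that P = (α+β)^{−1}Q + I_X is a Markov projection reduces to a 2 × 2 computation, with Q acting pointwise on the right half-axis as the matrix with columns summing to zero (this explicit matrix is our assumption, consistent with the intensity matrix (4.1) transposed for the action on densities):

```python
import numpy as np

alpha, beta = 2.0, 5.0
# Pointwise action of Q on (f_1(x), f_2(x)) for x >= 0
# (densities: columns, not rows, sum to zero)
Qm = np.array([[-beta, alpha],
               [beta, -alpha]])

Pm = Qm / (alpha + beta) + np.eye(2)

assert np.allclose(Pm @ Pm, Pm)           # P is idempotent
assert np.allclose(Pm.sum(axis=0), 1.0)   # P preserves total mass
assert (Pm >= 0).all()                    # P preserves positivity
print(Pm)  # rows (a', a') and (b', b'), a' = alpha/(alpha+beta), b' = beta/(alpha+beta)
```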
To begin with, we note that on the left half-axis the resolvent equation takes the form
λ f_i − a_i f_i″ = g_i on (−∞, 0), i = 1, 2.
Basic principles of ordinary differential equations tell us that there are solutions to this equation that are integrable on the left half-axis, and they are given by
f_i(x) = l_i e^{√(λ/a_i) x} + (2√(λ a_i))^{−1} ∫_{−∞}^0 e^{−√(λ/a_i)|x−y|} g_i(y) dy, x ≤ 0,   (5.2)
where l_1, l_2 are (yet) unknown constants. On the right half-axis, the resolvent equation takes the form
(λ + β) f_1 − a_1 f_1″ − α f_2 = g_1, (λ + α) f_2 − a_2 f_2″ − β f_1 = g_2.   (5.3)
More precisely, this is (a part of) the resolvent equation for A_1; the general case will be recovered later by replacing each instance of α and β by κα and κβ, respectively. Moreover, g_1, g_2 are now treated as members of L^1(R_+), and solutions are also sought in this space. The question of the existence of these solutions is answered in Lemma 5.1, below, but we need to make some preparatory remarks for this result. The quadratic equation
a_1 a_2 t² − [(λ+α) a_1 + (λ+β) a_2] t + (λ+α)(λ+β) − αβ = 0   (5.4)
has precisely two real solutions:
t_{1,2} = ((λ+α) a_1 + (λ+β) a_2 ± √Δ)/(2 a_1 a_2), where Δ = ((λ+α) a_1 − (λ+β) a_2)² + 4αβ a_1 a_2.
Moreover, since √Δ < (λ+α) a_1 + (λ+β) a_2, these solutions are positive; we note that
t_1 t_2 = λ(λ + α + β)/(a_1 a_2).
We look for (the first coordinate of) solutions of (5.3) of the form
f_1 = k_1 e_{t_1} + k_2 e_{t_2} + a combination of G_{t_1} h_1 and G_{t_2} h_2,   (5.5)
where k_1, k_2 are constants, h_1, h_2 are built from g_1 and g_2, e_t(x) = e^{−√t x}, x ≥ 0, and, for t > 0 and h ∈ L^1(R_+),
G_t h(x) = (2√t)^{−1} ∫_0^∞ e^{−√t|x−y|} h(y) dy.
We note that G_t is a bounded linear operator in L^1(R_+).

Lemma 5.1 The pair (f_1, f_2), with f_1 given by (5.5) and f_2 by (5.8), solves (5.3) (for any constants k_1, k_2).
Proof By the very definition of f_2, the first equality in (5.3) is satisfied, and we are left with proving the second one. To this end, we note that G_t h ∈ W^{2,1}(R_+) and that (G_t h)″ = t G_t h − h on R_+. Therefore, invoking (5.7) (again) and the definition of h_i (for the first time), the terms involving e_{t_i} cancel out, since the t_i's are roots of the quadratic (5.4). For the same reason, the terms involving g_1 cancel out, both being equal to αβ g_1. To summarize, using (5.4) one more time, the second equality in (5.3) follows.
Now, the pair (f_1, f_2) ∈ X defined by (5.2), (5.5) and (5.8) belongs to D(A) iff the values and the first derivatives of its coordinates match at x = 0. The first of these conditions (compatibility of values) yields two linear equations, and the other (compatibility of derivatives) yields two more. Hence, we have a linear system (5.9) of four equations for l_1, l_2, k_1, k_2, where, for further simplicity of notation, we write s_i = √(λ/a_i). The Gauss–Jordan elimination method (stopped one step before completion) now yields k_1 and k_2. Before continuing, we present a lemma summarizing the asymptotic behavior of the constants and functions appearing in the definitions of f_1 and f_2.
Lemma 5.2 If each occurrence of α and β is replaced by κα and κβ, respectively, then the asymptotic relations (i)–(vi) hold.
Proof Except for (vi), all the claims are immediate by standard calculus, if proven consecutively. To show (vi), we note first that lim_{t→∞} F_t h = 0 for each h ∈ L^1(R_+) ∩ L^∞(R_+). Therefore, the same is true for all h ∈ L^1(R_+), because L^1(R_+) ∩ L^∞(R_+) is dense in L^1(R_+) and the functionals F_t are equibounded. Since √t_1 C_1 = F_{t_1}(√t_1 h_1) and lim_{κ→∞} t_1 = ∞, this, combined with (v), completes the proof.
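The algebra behind the quadratic and its κ-asymptotics can be checked numerically. The sketch below takes the quadratic in the form a_1 a_2 t² − [(λ+α)a_1 + (λ+β)a_2] t + (λ+α)(λ+β) − αβ = 0 (our reading of (5.4), consistent with the coefficient quoted in the text) and verifies that its roots are real and positive, that t_1 t_2 = λ(λ+α+β)/(a_1 a_2), and that, under α → κα, β → κβ, the larger root t_1 blows up while the smaller one converges to λ/ā with ā = (αa_1 + βa_2)/(α+β):

```python
import numpy as np

lam, alpha, beta = 0.7, 1.5, 2.5
a1, a2 = 0.4, 3.0

def roots(al, be):
    """Roots t1 >= t2 of a1*a2*t^2 - B*t + C = 0 (assumed form of (5.4))."""
    B = (lam + al) * a1 + (lam + be) * a2
    C = (lam + al) * (lam + be) - al * be
    disc = B * B - 4 * a1 * a2 * C
    # disc = ((lam+al)*a1 - (lam+be)*a2)^2 + 4*al*be*a1*a2 > 0: two real roots
    assert np.isclose(disc, ((lam+al)*a1 - (lam+be)*a2)**2 + 4*al*be*a1*a2)
    t1 = (B + np.sqrt(disc)) / (2 * a1 * a2)
    t2 = C / (a1 * a2 * t1)     # product formula, avoids cancellation
    assert np.sqrt(disc) < B and t1 > 0 and t2 > 0
    return t1, t2

t1, t2 = roots(alpha, beta)
# product of the roots: the neat closed form lam*(lam+alpha+beta)/(a1*a2)
assert np.isclose(t1 * t2, lam * (lam + alpha + beta) / (a1 * a2))

# under alpha -> kappa*alpha, beta -> kappa*beta: t1 -> infinity, while
# t2 -> lam/abar, the decay rate on the amalgamated edge
abar = (alpha * a1 + beta * a2) / (alpha + beta)
for kappa in (1e2, 1e4, 1e6):
    t1k, t2k = roots(kappa * alpha, kappa * beta)
    print(kappa, t1k, t2k)
assert abs(t2k - lam / abar) < 1e-3
```

The limit t_2 → λ/ā is exactly what one expects: in the limit, the solution on the amalgamated edge decays like e^{−√(λ/ā)x}, i.e. it solves the resolvent equation for the averaged diffusion coefficient ā.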
We are finally able to state the first of the two main results of this section: it provides information on convergence of resolvents.

Proposition 5.3 If each occurrence of α and β is replaced by κα and κβ, respectively, then:
(a) For κ large enough, (5.9) has a unique solution. Moreover, equations (5.2), (5.5) and (5.8), with k_1, k_2, l_1, l_2 calculated from (5.9), give the solution to the resolvent equation for A_κ. (b) As κ → ∞, the solutions of the resolvent equations for A_κ converge to the pair (f_1, f_2) described in (5.11). Proof (a) The main determinant, say W, of the last two equations in (5.9) may be computed explicitly, and it is dominated by a term proportional to √t_1. Since lim_{κ→∞} √t_1 = ∞, W is non-zero for κ large enough; this shows uniqueness of k_1 and k_2 in (5.9), which in turn implies uniqueness of l_1 and l_2 (by the first two equations). The rest is clear.
All that is left to do now is to interpret, or decipher, this result and provide the link with the Freidlin–Wentzell transmission conditions. To this end, let X_0 be the subspace of X composed of pairs (f_1, f_2) such that β f_1 = α f_2 on R_+. This subspace of X is isometrically isomorphic to L^1(Y), the space of integrable functions on the 'infinite y'-shaped graph depicted in Fig. 3. The latter space in turn may be identified with Y = L^1(R_−) × L^1(R_−) × L^1(R_+), the isometric isomorphism I : X_0 → Y being given by
I(f_1, f_2) = (f_1|_{R_−}, f_2|_{R_−}, (f_1 + f_2)|_{R_+}).
Next, let B_0 be the operator in Y given by
B_0(φ_1, φ_2, φ_3) = (a_1 φ_1″, a_2 φ_2″, ā φ_3″), where ā = α′a_1 + β′a_2,   (5.13)
on the domain composed of triples whose coordinates belong to the respective W^{2,1} spaces and satisfy the transmission conditions
φ_1(0−) = α′ φ_3(0+), φ_2(0−) = β′ φ_3(0+), a_1 φ_1′(0−) + a_2 φ_2′(0−) = ā φ_3′(0+).   (5.14)
These transmission conditions are dual to those of the previous section: to be more precise, they describe the same physical/biological phenomenon, yet in a different, 'dual' space. Now, the isomorphic copy of B_0 in X_0 is given by
B(f_1, f_2) = (χ_1 f_1″, χ_2 f_2″), where χ_i = a_i 1_{(−∞,0)} + ā 1_{[0,∞)},
on the domain composed of (f_1, f_2) such that f_1 and f_2 are continuous on R (so that in particular
f_i(0−) = f_i(0+), i = 1, 2;   (5.15)
this corresponds to the first two conditions in (5.14)), and
a_1 f_1′(0−) + a_2 f_2′(0−) = ā (f_1 + f_2)′(0+).   (5.16)
(It goes without saying that (f_1, f_2) ∈ X_0.) It is quite easy to solve the resolvent equation for B: given λ > 0 and (g_1, g_2) ∈ X, the solution is of the form (5.17), the constants m_1, m_2 and n being chosen so that the transmission conditions (5.15) and (5.16) are satisfied. Finally, for (g_1, g_2) ∈ X_0 we have α′(g_1 + g_2)|_{R_+} = g_1|_{R_+}, so that C_{2,∞} = C, and (5.17) is the same as (5.11), except perhaps for the constants. Moreover, a bit of algebra shows that k of (5.12) is the same as n of (5.18). It follows that m_i = l_i, i = 1, 2, and so, denoting by R_λ the limit of the resolvents of A_κ found in Proposition 5.3,
(λ − B)^{−1} g = R_λ g, g ∈ X_0.   (5.19)
More generally,
R_λ = (λ − B)^{−1} P,   (5.20)
where P is defined in (5.1).
Theorem 5.4 We have
lim_{κ→∞} e^{tA_κ} = e^{tB} P
strongly and almost uniformly in t > 0.
Proof Relation (5.20) (in fact, (5.19) suffices) shows that condition (a) of Kurtz's singular perturbation theorem is satisfied. The rest of the argument is precisely the same as in the proof of Theorem 4.1.
To summarize, we have proved a 'dual' version of the main result of the previous section. In contrast to the arguments presented there, here the question of convergence of resolvents is reduced to that of a (singular) convergence of solutions of a system of linear equations in R^4. Probabilistically, the main theorem of this section speaks of convergence of the densities of the related stochastic processes in the L^1 norm.

A cosine family
In this (last) section, we want to show that the operator B_0 introduced in (5.13) and (5.14) generates a bounded cosine family of operators. (Of course, this implies that so does its isomorphic image B in X_0.) To begin with, we note that in the definition of B_0 the constants α and β do not appear directly, but merely via α′ and β′. Therefore, from now on, and without loss of generality, we assume that α + β = 1. This will simplify notation: in (5.10) and (5.14) the primes may be dropped.
We use Lord Kelvin's method of images [6,7]. To this end, let Z = L^1(R) × L^1(R) × L^1(R), and let C = {C(t), t ∈ R} be the cosine family in Z defined as follows:
C(t)(f_1, f_2, f_3) = (C_{σ_1}(t) f_1, C_{σ_2}(t) f_2, C_{σ_3}(t) f_3), where C_σ(t)f(x) = ½[f(x + σt) + f(x − σt)],
and σ_i = √a_i, i = 1, 2, and σ_3 = √ā. Our aim is to show that Y (equivalently: L^1(Y)) is isomorphic to a certain subspace, say Z_0, of Z which is invariant under C (see Lemma 6.1). This will allow us to describe B_0 as the generator of the isomorphic image of C restricted to Z_0 (see Theorem 6.2).
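The basic cosine family behind Kelvin's formula is the scalar d'Alembert family C_σ(t)f(x) = ½[f(x+σt) + f(x−σt)]. As a toy check (our illustration only: σ = 1, integer time steps, and a periodic grid, so that the shifts are exact), one can verify the cosine functional equation C(t+s) + C(t−s) = 2C(t)C(s), as well as preservation of the integral:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(64)          # a function sampled on a periodic grid

def C(t, h):
    """d'Alembert cosine family with sigma = 1: (C(t)h)(x) = (h(x+t) + h(x-t))/2."""
    return 0.5 * (np.roll(h, -t) + np.roll(h, t))

for t in range(5):
    for s in range(5):
        lhs = C(t + s, f) + C(t - s, f)
        rhs = 2.0 * C(t, C(s, f))
        assert np.allclose(lhs, rhs)   # cosine functional equation

assert np.allclose(C(0, f), f)           # C(0) is the identity
assert np.isclose(C(3, f).sum(), f.sum())  # each C(t) preserves the integral
```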
These operators preserve the integral, i.e., F C_{B_0}(t) f = F f for all f ∈ Y, t ∈ R, where F ∈ Y* is defined by
F(φ_1, φ_2, φ_3) = ∫_{R_−} φ_1 + ∫_{R_−} φ_2 + ∫_{R_+} φ_3.
Proof The d'Alembert formula for C_{B_0} and the strong continuity of t ↦ C_{B_0}(t) are direct consequences of the corresponding properties of C (use (6.4)). The operators in the abstract Kelvin formula are equibounded because C(t), t ∈ R, are Markov operators in Z (and hence are contractions). Preservation of the integral is an immediate consequence of (6.2), coupled with the fact that C(t), t ∈ R, preserve the integral in Z.
Thus, all we need to show is that B_0 is the generator of the cosine family defined by (6.5). We note that the domain of the generator, say G, of C is W^{2,1}(R) × W^{2,1}(R) × W^{2,1}(R), with G(f_1, f_2, f_3) = (a_1 f_1″, a_2 f_2″, ā f_3″), and that the generator of C restricted to Z_0 is the part G_p of G in Z_0. Since on the right-hand side of (6.5), instead of C(t), we could write its restriction to Z_0, we conclude that the generator of the cosine family given by the Kelvin formula is the isomorphic copy of G_p (via J).
Let f ∈ D(G_p). Then the coordinates f_1, f_2 and f_3 of f are members of W^{2,1}(R). Since the right-hand and left-hand limits of the f_i at x = 0 must coincide, the last equation in (6.1) implies, after simple algebra, that