Three-Scale Singular Limits of Evolutionary PDEs

Singular limits of a class of evolutionary systems of partial differential equations having two small parameters and hence three time scales are considered. Under appropriate conditions solutions are shown to exist and remain uniformly bounded for a fixed time as the two parameters tend to zero at different rates. A simple example shows the necessity of those conditions in order for uniform bounds to hold. Under further conditions the solutions of the original system tend to solutions of a limit equation as the parameters tend to zero.


Introduction
Many physical systems contain several small parameters, such as the Mach number, Alfvén number, Froude number, and Rossby number. When these parameters are taken to have fixed ratios to one another, the system has two time scales: one induced by the terms containing the small parameters and the other coming from the order-one terms in the equation. The classical theory of singular limits for evolutionary partial differential equations (PDEs) ([1,4,5,7,9,12,13] and numerous papers on particular systems, for example [11]) was developed to treat this case. In order to determine the behavior of solutions when two physical parameters tend to zero in a different manner it is necessary to develop an analogous theory for systems with three time scales. The systems to be considered here have the form (1.1), where ε and δ are small parameters. As in the theory of two-scale singular limits, the system without the large terms is assumed to be symmetric hyperbolic, and L and M are assumed to be antisymmetric constant-coefficient differential or pseudodifferential operators of order at most one. As for two-scale singular limits [4,7], parabolic terms of size O(1) could be added to the right side of (1.1), although the complications such terms induce would be greater in the three-scale case. The fundamental discovery of Klainerman and Majda [7,8] for two-scale singular limits was that the presence of the small parameter in the matrix A_0, which occurs naturally in the normalized equations for low Mach number fluid flow, induces a delicate balance. As they showed, this ensures that solutions of (1.1) with δ = 1, having fixed initial data belonging to a Sobolev space of sufficiently high index, exist for a time independent of the small parameter ε and satisfy bounds independent of that parameter, without the need for additional conditions on the large terms or the initial data, such as those assumed in [1,12] to treat the case when A_0 depends on u rather than εu.
Whenever the small parameter δ in (1.1) is not asymptotically smaller than ε, that is, when δ ≥ cε for some arbitrarily small positive constant c, the Klainerman-Majda balance is essentially preserved, and their uniform existence result remains valid with only cosmetic changes to the proof. Similarly, when A_0 is a constant matrix, as in the rotating shallow water equations ([10, Equation (2.2)]), the Klainerman-Majda uniform existence result remains valid for arbitrary δ and ε.
Hence we will be concerned here with the case when A_0 does depend nontrivially on εu, and

0 < δ ≤ ε ≤ 1.    (1.2)
Our first main result is a uniform existence theorem under two additional assumptions. The first condition is (1.3), which involves the Sobolev embedding exponent s_0 in dimension d defined in (1.4). The second condition is that the initial data u_0(x, ε, δ) are uniformly bounded in the Sobolev space H^{s_0+1}(D) and are "well-prepared" in the usual sense that the initial time derivative (1.5) is uniformly bounded in H^{s_0}(D), with the domain D being either the whole space R^d or the torus T^d. Examples of initial data satisfying this condition are given in (3.10) below. For convenience, we shall henceforth omit the spatial domain in integrals and function spaces throughout the paper. Although (1.3) limits how small δ can be compared to ε, it is consistent with the scaling (1.2) that violates the Klainerman-Majda balance. Moreover, both conditions are necessary, at least for obtaining uniform bounds on solutions of general systems without Klainerman-Majda balance, as will be shown via a simple explicit example. Our other main result is a convergence theorem showing, under the additional assumptions described below, that as ε and δ both tend to zero, solutions of (1.1) whose initial data converge in H^{s_0+1} tend to the solution of a certain limiting equation. The framework of the convergence theorem is the same as for two-scale singular limits: the bounds of the existence result yield compactness, which implies that every sequence of ε and δ tending to zero while obeying (1.3) has a subsequence along which the solution converges, and convergence without restricting to such subsequences is obtained by showing that the limit satisfies a limit equation for which solutions of initial-value problems are unique. However, both the form and the derivation of the limit equation are more complicated for three-scale singular limits.
For the two-scale singular limit obtained when δ ≡ 1, the limit equation is obtained by decomposing (1.1) into the projections onto the null space of M and onto its orthogonal complement, multiplying the latter by ε, and taking the limits of the results. However, in order to obtain the limit equation for the three-scale singular limit in which (1.2), (1.3) hold it is necessary to use perturbation theory to compute some number of terms of the power series in the small parameter μ = δ/ε for the eigenvalues and eigenspaces of L + μM in Fourier space. The number of terms required and the resulting limit equation depend on the relationship between δ and ε as they both tend to zero. In order to obtain convergence without restricting to subsequences it is necessary to restrict the relationship between δ and ε so as to obtain a specific limit equation. This requires the additional assumption that for some integer s ≥ s_0 either (1.6) or (1.7) holds; when the exponent r there is not an integer, (1.7) holds with s = ⌊r⌋. The limit equation is different for different values of s and even for different values of C in (1.6), but is the same for all r in (s, s+1). The reason that the limit equation depends on C is that when (1.6) holds the limit equation contains a term T_lim arising from the power series expansion in δ of (1/δ)(L + μM). Moreover, although L and M are both bounded operators from H^1 to L^2, it turns out that T_lim may not be, as will be explained in Definition 4.4 and Remark 4.5 below. Such terms do not occur in two-scale singular limits. As a result, the second time derivative of the limit solution may not belong to L^2, although the limit process ensures that its first time derivative does belong to L^2.
After presenting the example showing the necessity of our conditions for obtaining uniform bounds in Section 2, the uniform existence theorem will be formulated precisely and proven in Section 3, and the convergence theorem will be formulated precisely and proven in Section 4. Some simple examples of the perturbation procedure and the limit equations will also be presented in that section. In forthcoming work the results here will be applied to the problem that motivated this research, namely the simultaneous zero Alfvén number and zero Mach number limit of the scaled compressible inviscid MHD equations, where the small parameters ε_M and δ_A are respectively the Mach number and Alfvén number, the fluid density is 1 + ε_M r, its velocity is u, the magnetic field is e_z + δ_A b with e_z being the unit vector in the z-direction, and the coefficient functions a and R depend on the constitutive relation giving the fluid pressure as a function of its density.

Example
Consider the system (2.1), which has the form (1.1), together with the initial data (2.2), which satisfy the condition that the initial time derivative be uniformly bounded.
Arguments that we will make regarding this simpler system could be adapted to more complex versions. For example, the system could be turned into one in which the large terms involve derivatives with respect to an additional spatial variable y by replacing the terms −(1/δ)v and (1/δ)u by (1/δ)v_y and (1/δ)u_y, respectively, and changing u_0 to δ cos y. A term containing 1/ε could also be added. It will be convenient to write the solution to (2.1), (2.2) in terms of a transformed variable, which satisfies (2.5). Differentiating (2.5) or its derivatives with respect to t produces a term containing a factor 1/δ, while differentiating with respect to x produces a term containing a factor ε/δ, since the x-dependence in the exponent of (2.5) lies inside a(ε·). Taking into account the factor of δ in (2.5) that comes from the initial condition, this shows that (2.6) holds for some function z_k that is not identically zero, provided that both a and w_0 genuinely depend on their arguments. The standard existence theory for symmetric hyperbolic systems in spatial dimension d requires obtaining a bound on the H^{s_0+1} norm of solutions. The system (2.1) can be considered to be a system in any dimension, and estimate (2.6) implies that the solution of (2.1), (2.2) will be uniformly bounded in H^{s_0+1} only when ε^{s_0+1}/δ^{s_0} is bounded, which requires that (1.3) hold. Moreover, if the condition that the initial data be well prepared is dropped, then the initial value of u in (2.2) can be 1 rather than δ, which makes (2.6) more singular by one power of δ. The condition that the H^{s_0+1} norm of the solution be uniformly bounded then requires that ε^{s_0+1}/δ^{s_0+1} be bounded, that is, that δ ≥ cε, so no general result beyond the Klainerman-Majda balance is then possible.
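The derivative counting just described, where each time derivative contributes a factor 1/δ while each spatial derivative contributes only ε/δ, can be checked numerically on a toy profile. The sketch below is not the paper's system (2.1); it uses a hypothetical smooth coefficient a and a model solution of amplitude δ with phase t·a(εx)/δ, mimicking the structure described around (2.5).

```python
import numpy as np

eps, delta = 1e-2, 1e-3                 # delta << eps, as in (1.2)
a = lambda s: 1.0 + 0.5 * np.sin(s)    # hypothetical smooth coefficient

# Model solution: amplitude delta, phase t*a(eps*x)/delta.
w = lambda t, x: delta * np.cos(t * a(eps * x) / delta)

h = 1e-7
ts = np.linspace(0.0, 1.0, 2001)
xs = np.linspace(0.0, 1.0, 2001)
wt_arr = (w(ts + h, 1.0) - w(ts - h, 1.0)) / (2 * h)   # d/dt at x = 1
wx_arr = (w(1.0, xs + h) - w(1.0, xs - h)) / (2 * h)   # d/dx at t = 1

print(np.abs(wt_arr).max())   # O(1):   amplitude delta times frequency 1/delta
print(np.abs(wx_arr).max())   # O(eps): amplitude delta times eps/delta
```

Each further t-derivative multiplies by another 1/δ, while each x-derivative multiplies by another ε/δ, which is how the ratio ε^{s_0+1}/δ^{s_0} arises for the H^{s_0+1} norm.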

Scaling
Estimate (2.6) implies that the derivatives of solutions (u, v, w) of (2.1) satisfy corresponding estimates, except that (u, v, w) itself and its pure spatial derivatives are no smaller than O(1), because that is the size of the component w. These estimates suggest that the appropriate norm of solutions of (1.1) to estimate would be the weighted norm (3.1), where as usual D^α denotes the spatial derivative ∏_{j=1}^d ∂_{x_j}^{α_j} of order |α| := Σ_{j=1}^d α_j. Although our method indeed allows us to estimate the weighted norm (3.1) of solutions, doing so requires keeping an exact count of the spatial derivatives appearing in instances of the Gagliardo-Nirenberg inequalities (3.16) below. In order to avoid the need to count spatial derivatives we will instead perform a simplified estimate by using weights that depend only on the number of time derivatives, with the weight of the term ∂_t^k u and its spatial derivatives equal to ε^k. These weights equal their counterparts in (3.1) for the highest spatial derivative of ∂_t^k u, under the assumption that equality holds in (1.3). For lower-order spatial derivatives when that equality holds, or for all cases when strict inequality holds in (1.3), the weights we use are smaller than their counterparts in (3.1). Hence the simplified estimate will be somewhat weaker than the estimate that would be obtained using (3.1). With one exception this difference is of little importance, because estimates of norms of time derivatives of solutions weighted by small constants are simply a means of obtaining an unweighted estimate for the spatial norms of solutions. The one exception is that the L^2 norm of u_t in (3.1) has weight one and so yields a uniform bound, while the L^2 norm of u_t in the simplified scheme has weight ε and so does not yield a uniform bound.
Obtaining a uniform bound for some norm of u_t is important for the convergence theory, and it will turn out that the time evolution of the unweighted L^2 norm of u_t can be estimated in terms of the norms appearing in the simplified estimates, so we will simply adjoin the unweighted L^2 norm of u_t to the simplified scheme of estimates.
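In symbols, the simplified quantities described above take roughly the following form (a sketch only, consistent with the prose; the precise definitions are the displays (3.1)-(3.2), which also insert the matrix A_0(εu) into the inner products):

```latex
\[
  |||u|||_{s_0+1,\varepsilon,A_0}^2
  \;=\; \sum_{k=0}^{s_0+1} \varepsilon^{2k}
        \sum_{|\alpha|\le s_0+1-k}
        \int D^{\alpha}\partial_t^k u \cdot
             A_0(\varepsilon u)\, D^{\alpha}\partial_t^k u \,dx,
\]
\[
  ||||u||||_{s_0+1,\varepsilon,A_0}^2
  \;=\; |||u|||_{s_0+1,\varepsilon,A_0}^2 \;+\; \|u_t\|_{L^2}^2 .
\]
```

The second quantity is the first with the unweighted L^2 norm of u_t adjoined, as just described.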
However, as is common in the theory of hyperbolic systems, we must modify the standard L^2 and H^s norms to include the coefficient matrix A_0(εu) of the time-derivative term in the PDE (1.1), with the argument εu of A_0 taken from some solution to (1.1). We therefore define the inner product and norms (3.2). The corresponding quantities with the subscript A_0 omitted will denote the standard inner product and norms, in which A_0 is replaced by the identity matrix. Assumption 3.3 together with the estimates to be obtained will ensure that the two are equivalent for the time intervals considered here. The definitions in (3.2) are a slight abuse of notation, both since the argument of A_0 is usually not given explicitly but must be understood from the context, and because the value of |||u|||_{s,ε,A_0} and ||||u||||_{s,ε,A_0} at a given time does not depend solely on the value of u at that time, on account of the inclusion of time derivatives.
Remark 3.1. The standard existence theorem for symmetric hyperbolic systems ([9, Ch. 2, Theorem 2.1]) shows that there exists a unique solution, for some time that may depend on δ and ε, to the initial-value problem consisting of (1.1) together with an initial condition u(0, x) = u_0(x, δ, ε) ∈ H^{s_0+1}, and, moreover, that solution continues to exist as long as its H^{s_0+1} norm remains finite ([9, Ch. 2, Theorem 2.2]). Hence in order to prove that the time of existence can be taken to be independent of δ and ε it suffices to obtain a uniform bound on the H^{s_0+1} norm of the solution. The proof of the existence theorem uses estimates in which the function u appearing inside A_0 in the norms (3.2) differs from the solution being estimated. However, since in this paper the solutions being estimated are already known to exist, the function u appearing inside A_0 in the norms (3.2) will simply be the solution that is being estimated.
Remark 3.2. The standard energy estimates, both for symmetric hyperbolic systems without large terms and for singular limits obeying the Klainerman-Majda balance, involve only spatial derivatives of the solution. The reasons that time derivatives are also needed here, and why an unweighted estimate for the time derivative is obtained only in the L^2 norm, will be explained after the proof of Theorem 3.6.

Assumptions and Initial Data
The following standard conditions on the terms appearing in system (1.1) will be assumed, where s_0 is defined in (1.4): 1. The coefficients are smooth functions of their arguments. 2. The matrix A_0 is positive definite; more specifically, there are positive constants c_0 and b_0 such that (3.3) holds. 3. The operators L and M are anti-symmetric constant-coefficient differential or pseudodifferential operators of order at most one.
Remark 3.4. The identity (3.4), together with the definitions in Assumption 3.3, yields the equivalence of the A_0-weighted norms and their standard counterparts. As noted in the introduction, the initial data will be required to be chosen so that u_t(0, x) from (1.5) is uniformly bounded in H^{s_0}. From the PDE (1.1) we see that this well-preparedness condition is equivalent to condition (3.5). Under the above conditions, Lemma 3.5 below shows that the ||||·||||_{s_0+1,ε,A_0} norm of u will be uniformly bounded at time zero. In the statements of both this result and the main theorem we will use the Sobolev embedding constant, that is, the constant K such that (3.6) holds.

Lemma 3.5. Assume that the initial data satisfy (3.7) for all 0 < ε ≤ ε_0, and that δ and ε satisfy (3.8). Then the ||||·||||_{s_0+1,ε,A_0} norm of u at time zero is bounded by a constant M independent of δ and ε, for all δ and ε satisfying the above conditions.
Proof. Roughly speaking, the result of the lemma follows from the fact that when u_t(0, x) is O(1), using the PDE (1.1) plus induction shows that at time zero ∂_t^k u is O(δ^{1−k}) in H^{s_0+1−k} for 1 ≤ k ≤ s_0 + 1, and hence yields the uniform boundedness of the ||||·||||_{s_0+1,ε,A_0} norm of u at time zero.
More specifically, by repeated applications of the PDE (1.1) to express higher time derivatives in terms of u, u_t and their spatial derivatives, and applications of L and M to them, we obtain, for 2 ≤ k ≤ s_0 + 1, an estimate for ∂_t^k u. To see this, note first that the assumptions on the initial data ensure that A_0 ≥ c_0 I. Applying ∂_t^{k−1} to (1.1), using the invertibility of A_0 to solve the result for ∂_t^k u, taking up to s_0 + 1 − k spatial derivatives of the result, and summing the L^2 norms of the results yields a formula for the H^{s_0+1−k} norm of ∂_t^k u in terms of L^2 norms of products of spatial derivatives of lower-order time derivatives of u. Note that coefficients such as A_j(u) can be estimated in the maximum norm in terms of ‖u‖_{s_0} and so may be pulled out of those L^2 norms.
For the case k = 2 this yields an estimate of ‖u_tt‖_{s_0−1} in terms of L^2 norms of products of the factors u, u_x, u_t, G and G_t and their spatial derivatives of order at most s_0, with coefficients of size at most O(1/δ) coming from the presence of 1/δ in the time derivative of (1.1). Since all those factors are bounded in H^{s_0} at time zero, and H^{s_0} is an algebra, this yields the estimate ‖u_tt‖_{s_0−1} ≤ c/δ at time zero. The analogous expressions for ‖∂_t^k u‖_{s_0+1−k} with k > 2 include factors of u_tt and possibly higher time derivatives, plus their spatial derivatives. Although u_tt and higher-order time derivatives of u do not belong to H^{s_0} at time zero, the resulting expressions could be estimated by the method used in the proof of Theorem 3.6 to estimate similar expressions. However, it is simpler to use finite induction to express higher-order time derivatives in terms of u and u_t. Since the time derivative in (1.1) is expressed in terms of expressions involving at most one spatial derivative, this again yields an estimate in terms of L^2 norms of products of the factors u, u_x, u_t and their spatial derivatives of order at most s_0, plus time and spatial derivatives of G of order at most s_0, this time with coefficients of size at most O(1/δ^{k−1}), since equation (1.1) is used at most k − 1 times to express k − 1 time derivatives in terms of spatial derivatives. Note that wherever u_t and its spatial derivatives occur, the time derivative is left unaltered rather than using (1.1) to express u_t in terms of u, because u_t is O(1) at time zero but the individual terms on the right side of (1.1) may not be. This yields the estimates ‖∂_t^k u‖_{s_0+1−k} ≤ c/δ^{k−1} at time zero. As indicated at the beginning of the proof, these estimates together with assumption (3.8) show that the ||||·||||_{s_0+1,ε,A_0} norm of u is uniformly bounded at time zero.
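The chain of estimates in this proof can be summarized as follows. This is a sketch in the notation above, with C, C′ generic constants; the explicit form of the lower bound on δ is reconstructed from the example in Section 2, where (1.3) was seen to amount to the boundedness of ε^{s_0+1}/δ^{s_0}:

```latex
\[
  \|\partial_t^k u(0,\cdot)\|_{H^{s_0+1-k}} \;\le\; \frac{C}{\delta^{\,k-1}},
  \qquad 1 \le k \le s_0+1,
\]
so that the weighted quantities appearing in the norm satisfy
\[
  \varepsilon^{k}\,\|\partial_t^k u(0,\cdot)\|_{H^{s_0+1-k}}
  \;\le\; C\,\varepsilon\Bigl(\frac{\varepsilon}{\delta}\Bigr)^{k-1}
  \;\le\; C',
\]
since $\delta \ge c\,\varepsilon^{(s_0+1)/s_0}$ gives
$\varepsilon(\varepsilon/\delta)^{k-1}
 \le c^{\,1-k}\varepsilon^{\,1-(k-1)/s_0}$,
whose exponent is non-negative for $k \le s_0+1$.
```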
The well-preparedness condition (3.5) can be achieved, for example, by using initial data of the form (3.10) for some nonnegative integer m, with the u_j belonging to H^{s_0+1} and U_0 bounded in that space uniformly in δ and ε. In view of the scaling assumption (1.2), condition (3.5) will then hold provided that (3.11) holds and either M u_m = 0 or δ^m ≤ cε^{m+1}. When the ranges of L and M overlap, the condition (3.11) allows more general initial data than would be obtained by requiring that each side of those equations vanish separately.

Theorem and Proof
Proof. The local existence and continuation theorems ([9, Ch. 2, Theorems 2.1-2.2]) mentioned in Remark 3.1 ensure that the solution of the initial-value problem exists on some time interval that might depend on δ and ε, and will continue to exist for a time independent of those small parameters provided that it satisfies an H^{s_0+1} estimate independent of them. Hence it suffices to prove such an estimate. Moreover, although the norm ||||·||||_{s_0+1,ε,A_0} used in the estimates below depends on the solution u being estimated, condition (3.3) ensures that the resulting estimate will indeed be uniform. The estimates that will be derived are similar to standard energy estimates for solutions of symmetric hyperbolic systems, but require keeping track of the powers of δ and ε that appear in those estimates for the system (1.1).
Applying D^α ∂_t^k with 0 ≤ k ≤ s_0 + 1 and 0 ≤ |α| ≤ s_0 + 1 − k to (1.1), taking the inner product with 2D^α ∂_t^k u, integrating over the spatial variables, integrating by parts in the terms that involve A_j undifferentiated, noting that the terms involving L or M drop out on account of the anti-symmetry of those operators, summing over all α satisfying the above-mentioned condition, and multiplying the result by the weight ε^{2k} yields (3.12), where the inequality is obtained by pulling out εu_t · ∇_u A_0 + Σ_j u_{x_j} · ∇_u A_j from the first integral in maximum norm, and breaking the second integral into several parts and using the Cauchy-Schwarz inequality in each of them.
Since A_0 = A_0(εu) will be differentiated at least once when it appears in any commutator term on the right side of the inequality in (3.12), which yields at least one power of ε, the power of ε in every term appearing on the right side of the inequality in (3.12) is at least as large as the total number of time derivatives in that term. By the definition of the |||·|||_{s_0+1,ε,A_0} norm plus the smoothness assumption on A_0, this implies that in order to bound the right side of the inequality in (3.12) by a continuous function of |||u|||_{s_0+1,ε,A_0} it suffices to bound all the terms there by a continuous function of |||u|||_{s_0+1,1} after replacing ε by 1 and replacing A_0 by the identity matrix.
The condition on s_0 ensures that ‖u_t‖_{L^∞} and ‖∇u‖_{L^∞} are bounded by a constant times ‖u_t‖_{s_0} and ‖u‖_{s_0+1}, respectively, and those norms are each bounded by |||u|||_{s_0+1,1}. By the smoothness of the A_j, Σ_{j=0}^d ‖∇_u A_j‖_{L^∞} ≤ c(‖u‖_{s_0}) ≤ c(|||u|||_{s_0+1,1}) for some continuous function c. This yields the desired estimate for the entire first term on the right side of the inequality in (3.12). The terms on the right side of the inequality in (3.12) in which G and the L^∞ norm of H appear are also so bounded, in view of the assumptions on those functions.
There remains to estimate only the terms on the right side of the inequality in (3.12) that involve commutators. Since the factor ‖∂_t^k u‖_{s_0+1−k} multiplying the norms of the commutators is one of the terms in |||u|||_{s_0+1,1}, only the norms of the commutator terms themselves must be estimated. We can pull out in the L^∞ norm any factor such as ∇_u H that depends only on t, x and u without derivatives, and the assumptions on the various coefficients ensure that each factor so pulled out is bounded by a continuous function of ‖u‖_{s_0} and hence by a continuous function of |||u|||_{s_0+1,1}. Since the presence of the commutator ensures that at least one derivative will be applied to the function appearing in the commutator, the terms arising from the commutators that remain inside the L^2 norms all take the form (3.13), where L ≥ 2, 1 ≤ |α_ℓ| + k_ℓ ≤ s_0 + 1, and Σ_ℓ (|α_ℓ| + k_ℓ) ≤ s_0 + 2. If |α_ℓ| + k_ℓ = s_0 + 1 for some ℓ, then only one derivative is applied to the other factor, so that factor can be pulled out in L^∞ norm and estimated by ‖u‖_{s_0+1} or ‖u_t‖_{s_0}, both of which appear in |||u|||_{s_0+1,1}. After pulling out that factor the integral becomes ∫ |D^α ∂_t^k u|^2 dx, which is bounded by ‖∂_t^k u‖^2_{s_0+1−k}, which also appears in |||u|||_{s_0+1,1}. Otherwise |α_ℓ| + k_ℓ ≤ s_0 for all ℓ, and by using the multiple-factor version of Hölder's inequality we will bound the integral in (3.13) by (3.14), where the exponents p_ℓ, which will be chosen later, must satisfy (3.15). The integrals in (3.14) will then be bounded via the Gagliardo-Nirenberg inequality (3.16) (for example, [3, p. 24]), in which the parameters must satisfy 1/p = 1/2 − ar/d, r ≥ 1, and 0 ≤ a < 1, where as usual d is the spatial dimension. Although a is actually allowed to equal the endpoint value 1 for many values of the other parameters, we will avoid that value in order to obtain a unified proof. The inequality constraint on a will hold provided that 1/2 ≥ 1/p > 1/2 − r/d.
In order to estimate the integrals in (3.14) we apply (3.16) with v := D^{α_ℓ} ∂_t^{k_ℓ} u, so we will let r = s_0 + 1 − (|α_ℓ| + k_ℓ), since that is the highest Sobolev norm of D^{α_ℓ} ∂_t^{k_ℓ} u that is bounded by |||u|||_{s_0+1,1}. Since we only use (3.14) when |α_ℓ| + k_ℓ ≤ s_0 for all ℓ, the condition r ≥ 1 will indeed hold. Since the norm of D^{α_ℓ} ∂_t^{k_ℓ} u appearing in (3.14) is the L^{2p_ℓ} norm, p in (3.16) equals 2p_ℓ. Substituting in these values and multiplying everywhere by two turns the inequality constraint on p into the inequality constraint (3.19), where for simplicity we ignore the possibility p_ℓ = ∞, which will not be needed. Every value of p_ℓ satisfying (3.19) is allowed by both (3.17) and (3.18), so it suffices to show that we can choose values in the intervals in (3.19) that sum to one. That is possible iff the sum of the lower values there is less than one and the sum of the upper values is at least one. The latter condition holds trivially, and as noted above the second expression inside the max in (3.19) is negative iff |α_ℓ| + k_ℓ = 1, so it suffices to show that (3.20) holds. Let L_2 denote the number of values of ℓ for which |α_ℓ| + k_ℓ ≥ 2. If L_2 = 0 then the sum on the right side of (3.20) vanishes, so that condition indeed holds. When L_2 ≥ 1, (3.20) can be written more explicitly as (3.21), which can in turn be rewritten as (3.22). Since L_2 ≥ 1 by assumption, s_0 > d/2, L ≥ 2 and Σ_{ℓ=1}^L (|α_ℓ| + k_ℓ) ≤ s_0 + 2, the left side of (3.22) is non-negative, and the right side there is non-positive. Moreover, since L ≥ 2, if L_2 = 1 then there exists an ℓ for which |α_ℓ| + k_ℓ = 1, and in that case the fact that |α_ℓ| + k_ℓ ≤ s_0 implies that either L > 2 or Σ_ℓ (|α_ℓ| + k_ℓ) < s_0 + 2. This shows that either the left side of (3.22) is strictly positive or the right side there is strictly negative, and hence that inequality indeed holds.
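The feasibility argument above, in which each reciprocal 1/p_ℓ must lie strictly above max(0, 1 − 2r_ℓ/d) and at most 1 while the reciprocals sum to one, can be sketched as a small selection routine. The interval endpoints below are reconstructed from the surrounding description of (3.19) and are illustrative; spreading the slack evenly keeps every reciprocal strictly above its lower bound.

```python
from fractions import Fraction

def holder_exponents(orders, d):
    # orders[l] = |alpha_l| + k_l, the number of derivatives on factor l;
    # r_l = s_0 + 1 - orders[l] as in the proof. Returns reciprocals 1/p_l.
    s0 = d // 2 + 1                                    # Sobolev exponent
    lowers = [max(Fraction(0), Fraction(1) - Fraction(2 * (s0 + 1 - m), d))
              for m in orders]
    slack = Fraction(1) - sum(lowers)
    assert slack > 0, "condition (3.20) fails"         # proof: never happens
    # spread the slack evenly: strictly above each lower bound, at most 1
    return [low + slack / len(orders) for low in lowers]

recips = holder_exponents([2, 2], d=3)   # two factors with two derivatives each
print(recips)                            # [1/2, 1/2], i.e. p_l = 2, L^4 norms
```

The printed case reproduces the special case recalled in Remark 3.7 (p_ℓ = 2, d = 3, s_0 = 2), which gives some confidence that the reconstructed intervals are the intended ones.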
Summing over 0 ≤ k ≤ s_0 + 1 the estimates that we have obtained shows that (3.23) holds for some continuous function c.
Finally, differentiating (1.1) with respect to t, taking the inner product of the result with 2u_t, integrating over the spatial variables, integrating by parts in the terms that involve A_j undifferentiated, and noting that the terms involving L or M drop out on account of the anti-symmetry of those operators yields (3.24), where the first inequality follows in similar fashion to (3.12) and the second from (3.25), for some continuous function c. By Lemma 3.5, ||||u||||_{s_0+1,ε,A_0} is bounded uniformly in δ and ε by M at time zero, so the differential inequality (3.24) shows that there is a fixed positive constant T such that max_{0≤t≤T} ||||u||||_{s_0+1,ε,A_0} ≤ 2M.

Remark 3.7. 1. In the standard energy estimates for spatial derivatives of solutions of systems without large terms and of systems satisfying the Klainerman-Majda balance, integrals of the form (3.14) not containing time derivatives are estimated using the Gagliardo-Nirenberg inequality (3.26), with p = 2s/|α|, a = |α|/s, s ≤ s_0, and s > |α|, instead of (3.16). However, it is not possible to use (3.26) to estimate integrals involving second and higher time derivatives, because the boundedness of |||u|||_{s_0+1,ε,A_0} does not imply even an ε-dependent bound for ‖∂_t^k u‖_{L^∞} when k ≥ 2. 2. The special case of (3.14) and (3.16) in which p_ℓ = 2, |α_ℓ| + k_ℓ = 2, p = 4, r = 1, a = d/4, and d is either two or three, so that s_0 = 2, was used previously in [2, §4.1 and Appendix]. 3. The expression εu_t appears in the estimates for a purely spatial derivative D^α of u, arising from the commutator term [D^α, A_0]u_t. When the spatial derivative terms in the PDE are at most O(1/ε), then substituting for εu_t from the PDE yields a spatial derivative term of order one. Making this substitution allows spatial derivatives to be estimated without requiring estimates of time derivatives, both for systems without large terms and in the Klainerman-Majda theory.
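The closing step of the proof is a standard continuation argument: a differential inequality y′ ≤ c(y) with c continuous, together with y(0) ≤ M, forces y ≤ 2M up to a time T depending only on c and M, not on δ or ε. The bound function c below is a hypothetical stand-in for the continuous function produced by the estimates.

```python
import numpy as np

c = lambda y: 1.0 + y**2       # hypothetical continuous increasing bound
M = 1.0                        # bound on the norm at time zero

# T from separation of variables: T = integral from M to 2M of ds / c(s)
ys = np.linspace(M, 2 * M, 100001)
f = 1.0 / c(ys)
T = float(np.sum(0.5 * (f[:-1] + f[1:]) * np.diff(ys)))   # trapezoid rule

# Worst case y' = c(y), y(0) = M, integrated by forward Euler up to time T
t, y, dt = 0.0, M, 1e-5
while t < T:
    y += dt * c(y)
    t += dt
print(T, y)   # y reaches about 2M at time T, which is independent of eps, delta
```

For this particular c one has T = arctan(2M) − arctan(M); the point of the sketch is only that T comes out of c and M alone, which is what makes the existence time uniform in the small parameters.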
However, for the PDE (1.1) with the scaling (1.2) this procedure cannot be used, because it yields terms of order ε/δ, which is large. It is therefore necessary to leave the term εu_t on the right side of the energy estimates for spatial derivatives of u, and this necessitates estimating time derivatives as well. Similarly, two-scale systems for which A_0 depends on u rather than εu also require estimates of time derivatives [1,12]. 4. In a similar fashion, a term containing εu_tt appears in estimates for a spatial derivative of u_t on account of the commutator term [D^α ∂_t, A_0]u_t. Assuming that u_t is bounded initially but εu_tt is large at time zero, this prevents us from obtaining an unweighted estimate for spatial derivatives of u_t. The reason we do obtain an unweighted estimate for u_t itself is that the commutator term [∂_t, A_0]u_t does not yield any second time derivative. 5. The bound (3.8) on how fast δ can tend to zero compared to ε is only needed to ensure that the ||||·||||_{s_0+1,ε,A_0} norm of the solution is uniformly bounded at time zero. The proof of Theorem 3.6 therefore also yields uniform bounds for a uniform time in the case when the time derivatives of the solution through order s_0 + 1 are uniformly bounded at time zero, without the need for assumption (3.8) and without using weights of powers of ε in the norms. In particular, taking ε ≡ 1 and letting δ → 0 yields a proof, for arbitrary dimensions, of the uniform existence theorem stated in [1] but only proven there in the case d = 1, for which no Gagliardo-Nirenberg estimates are needed.

A Finite-Dimensional Perturbation Result
We begin with a result on perturbations of self-adjoint matrices T(μ) := μ^{−p}(T^{(0,0)} + μT^{(0,1)}), where μ is a small parameter and p is a positive integer. The result will be used in the proof of the convergence theorem in Section 4.2, where T^{(0,0)} and T^{(0,1)} will stand for the Fourier symbols of the operators L and M respectively. The result says that there is an orthogonal projection P(μ) that commutes with T(μ), on whose range T(μ) is bounded uniformly and has a limit as μ → 0, and on whose null space T(μ) is bounded from below by a constant times 1/μ and has a finite expansion in inverse powers of μ.
Here T(μ) = μ^{−p}(T^{(0,0)} + μT^{(0,1)}), where T^{(0,0)} and T^{(0,1)} are operators on a finite-dimensional inner-product space X that are either both self-adjoint or both skew-adjoint, μ is a small parameter, and p is a positive integer. Then:

1. There exists an orthogonal projection operator P(μ) that commutes with T(μ) for μ ≠ 0, is analytic in μ for real μ, and satisfies (4.1), (4.2) for 0 < μ < μ_0, where ‖·‖_X is the norm on the space X and μ_0 and the c_j are positive constants. 2. For 0 ≤ j ≤ p − 1 there exist commuting orthogonal projection operators P^{(j)} such that the ranges of the complementary projections I − P^{(j)} are mutually orthogonal subspaces, and (4.3) holds. Here the operators T^{(j,j)}, up to T^{(p,p)}, are defined by the expansions (4.4).

3. The operator T(μ)P(μ) admits the expansion (4.5), where the projections P^{(j)}(μ) are defined in the proof.

Proof. 1. The estimates derived below show that (4.1), (4.2) hold. We carry out the reduction process of [6, §II.2.3] while choosing the unperturbed eigenvalue zero at every stage. However, we do not want to include the range of I − P^{(j−1)} when considering the zero eigenspace of T^{(j,j)}, since that subspace has already been accounted for at previous stages of the reduction process. For this reason we replace the factor that would appear in (4.4), if the corresponding formula [6, (2.37) in §II.2.3] were simply rewritten in our notation, by the projection P^{(j)}(μ). This corresponds to the suggestion in [6, §II.2.3] to add a constant multiple of I − P^{(j−1)} to T^{(j,j)}, but without the need to modify that operator. This procedure yields (4.4).
Since T(μ), after multiplication by i if necessary, is self-adjoint for all μ, formula (4.4) implies that all the T^{(j,k)} are then also self-adjoint after that multiplication has been done if necessary. If at any stage of the reduction process the new unperturbed operator T^{(j+1,j+1)} does not have any zero eigenspace except for the range of I − P^{(j)}, then P^{(j+1)}, and hence also P(0), is identically zero, and if j + 1 < p − 1 then T^{(k)}(μ) is simply the zero operator for j + 1 < k ≤ p − 1. By the construction of the reduction process, I − P^{(j)}(μ) is the orthogonal projection onto the direct sum of the eigenspaces of T(μ) whose eigenvalues are of size O(μ^{j−p}). Since the eigenvalues for different values of j are distinct for small enough μ, and T(μ) is self-adjoint, the ranges of I − P^{(j)}(μ) for different values of j are orthogonal to each other. This implies that the I − P^{(j)}(μ) for different j commute with each other, and hence so do the P^{(j)}(μ). Since I − P(μ) is the orthogonal projection onto the union over 0 ≤ j ≤ p − 1 of the eigenspaces of T(μ) whose eigenvalues are of size O(μ^{j−p}), and those eigenspaces are orthogonal for distinct j, (4.8) holds. In addition, since the I − P^{(j)}(μ) project onto mutually orthogonal subspaces, (I − P^{(j_1)}(μ))(I − P^{(j_2)}(μ)) = 0 for j_1 ≠ j_2, which implies (4.9), in which the remaining terms contain at least two distinct factors I − P^{(j_k)}(μ). Since the eigenvalues of size O(1) of T^{(j)}(μ) are the perturbations of the nonzero eigenvalues of T^{(j,j)}, the continuity of the projections P^{(j)}(μ) shows that as μ tends to zero the orthogonal projection I − P^{(j)}(μ), onto the direct sum of the eigenspaces of the eigenvalues of T^{(j)}(μ) that are O(1), tends to the orthogonal projection I − P^{(j)} onto the direct sum of the eigenspaces of the eigenvalues of T^{(j,j)} that are nonzero. This convergence holds in the strong (finite-dimensional) operator topology, which is isometric to a suitably normed matrix space.
Therefore, taking the limit of (4.8), (4.9) and rearranging yields (4.3). Taking the limit of the identities $(I - P^{(j_1)}(\mu))(I - P^{(j_2)}(\mu)) = 0$, together with the orthogonality of the ranges of $I - P^{(j)}(\mu)$, implies the orthogonality of the ranges of $I - P^{(j)}$. This also shows that $P^{(j)}$ is the orthogonal projection onto the null space of $T^{(j,j)}$, where the $T^{(j,j)}$ are the first terms in the expansions (4.4).
Since $P(\mu)$ is the orthogonal projection onto the direct sum of the eigenspaces of $T(\mu)$ whose eigenvalues are $O(1)$ or $o(1)$, continuing the reduction process one more step yields (4.5).
The formulas for the $T^{(j,j)}$ are obtained by recursively using formula [6, (2.18) in §II.2.2], which in our notation becomes (4.11) in the case here, in which there are no nilpotents. In particular, for $j = 0$ and $n = 1$ only the term with $r = 1$ is present in the outer sum in (4.11), and the inner sum then contains only the case where $\nu_1 = 1$ and $k_1 = 0 = k_2$. Using (4.12), this yields (4.6). An analogous but longer calculation yields (4.7). Formula (4.11) shows that in order to calculate $T^{(2,2)}$ it is necessary to first calculate $T^{(1,1)}$ and $T^{(1,2)}$, while in order to calculate $T^{(3,3)}$ it would be necessary to first calculate $T^{(1,j)}$ for $1 \le j \le 3$ and then $T^{(2,j)}$ for $2 \le j \le 3$.

Example 4.3. 1. If $k \ne 0$ and $\ell \ne 0$ then $T^{(0,0)}$ and $T^{(2,2)}$ each have one nonzero eigenvalue, so the fact that the matrices are of size $2 \times 2$ implies that $T^{(j,j)} = 0$ for $j > 2$, while if $k \ne 0$ but $\ell = 0$ then $T^{(j,j)} = 0$ for $j \ge 2$. When $k = 0$ but $\ell$ is nonzero then $T^{(0,0)} = 0$, $P^{(0)} = I$, $T^{(1,1)} = T^{(0,1)}$, $P^{(1)} = 0$, and $T^{(j,j)} = 0$ for $j > 1$, while when both $k$ and $\ell$ vanish then, for all $j$, $T^{(j,j)} = 0$ and $P^{(j)} = I$.

2. The operators $L$ and $M$ in (1.1) are allowed to have order zero, that is, to be simply multiplication by fixed matrices, and then the operators in the lemma are simply the same operators.
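The first step of the reduction can be illustrated numerically. In the nilpotent-free case, formula (4.6) should reduce to the standard first-order compression $P^{(0)} T^{(0,1)} P^{(0)}$ restricted to the null space of the unperturbed operator, whose eigenvalues, scaled by $\mu$, predict the $O(\mu)$ eigenvalues of $T(\mu)$. The matrices below are illustrative stand-ins chosen for the sketch, not the operators of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Unperturbed self-adjoint operator T0 with a two-dimensional null space.
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
T0 = U @ np.diag([3.0, -2.0, 0.0, 0.0]) @ U.T

# Self-adjoint first-order perturbation, the analogue of T^(0,1).
A = rng.standard_normal((4, 4))
T1 = (A + A.T) / 2

mu = 1e-4
eigs = np.linalg.eigvalsh(T0 + mu * T1)     # ascending order
small = eigs[np.abs(eigs) < 1.0]            # the two O(mu) eigenvalues

# First-order reduced operator: T1 compressed to the null space of T0
# (the columns U[:, 2:] span that null space).
reduced = U[:, 2:].T @ T1 @ U[:, 2:]
predicted = mu * np.sort(np.linalg.eigvalsh(reduced))

# The O(mu) eigenvalues of T0 + mu*T1 match mu times the eigenvalues of
# the reduced operator, up to an O(mu^2) error.
assert np.allclose(small, predicted, atol=1e-6)
```

This mirrors why each stage of the reduction process may restrict attention to the zero eigenspace of the previous stage.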

Theorem and Proof
The following projections and operator will appear in the statement and proof of the convergence theorem. We assume that either (1.6) or (1.7) holds for some integer $s \ge s_0$.
Definition 4.4. Let $P$ and $T_{\lim}$ be the operators whose Fourier transforms are $P(k)$ and $T_{\lim}(k)$, respectively, obtained from the reduction process of Lemma 4.1.

Remark 4.5. Since $P(k)$ is an orthogonal projection for each $k$ and hence bounded by one, $P$ is an orthogonal projection on $L^2$ and a bounded operator on $H^s$ for all $s$. In contrast, although $T_{\lim}(k)$ is a bounded operator for each $k$, the operator $T_{\lim}$ may be unbounded. By Lemma 4.1, $T_{\lim}(k)$ is skew-adjoint for each $k$, so $T_{\lim}$ is antisymmetric.

Theorem 4.6. Assume that the conditions of Theorem 3.6 hold, that $\delta$ and $\varepsilon$ tend to zero while obeying either (1.6) or (1.7) for some integer $s \ge s_0$, and that $u_0(x, \delta, \varepsilon)$ converges in $H^{s_0+1}$ to $u_{0,0}(x)$ in that limit.
Proof. The uniform bound for the $|||\cdot|||_{s_0+1,\varepsilon,A_0}$ norm of the solution of (1.1) proven in Theorem 3.6 shows that $\max_{0 \le t \le T}\bigl[\|u(t,\cdot)\|_{s_0+1}^2 + \|u_t(t,\cdot)\|_{s_0}^2\bigr]$ is bounded uniformly in $\delta$ and $\varepsilon$, so a subsequence of $u$ converges, in an appropriate weak sense, to a limit $U$. In particular, this convergence together with the assumption on the convergence of the initial data shows that (4.16) holds.
By interpolation between Sobolev spaces, the convergence and bounds obtained so far imply that the subsequence also converges to $U$ in $C^0([0,T]; H^{s_0+1-\mu})$ for any $\mu > 0$, and hence also in $C^0([0,T]; C^1)$. This yields the convergence of the nonlinear terms in at least $C^0([0,T]; L^2)$. Now apply the projection $P(\mu)$ from Definition 4.4 with $\mu = \delta/\varepsilon$ to the PDE (1.1), which yields (4.17). As noted above, the expression in brackets on the right side of (4.17) converges in $C^0([0,T]; L^2)$ as $\delta$ and $\varepsilon$ tend to zero in the manner stated in the theorem.
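The Sobolev interpolation invoked above is, at the level of Fourier weights, just Hölder's inequality: $\|u\|_{H^\sigma} \le \|u\|_{H^{s_1}}^{1-\theta}\|u\|_{H^{s_2}}^{\theta}$ for $\sigma = (1-\theta)s_1 + \theta s_2$. A quick numerical check with made-up Fourier coefficients:

```python
import numpy as np

rng = np.random.default_rng(3)

# Fourier coefficients of a (random) periodic function; the decay makes
# all the norms below finite.
k = np.arange(1, 2001)
u_hat = rng.standard_normal(2000) / k**2

def h_norm(s):
    """Sobolev norm ||u||_{H^s} = ( sum (1 + k^2)^s |u_k|^2 )^(1/2)."""
    return np.sqrt(np.sum((1.0 + k**2) ** s * np.abs(u_hat) ** 2))

s1, s2, theta = 0.0, 3.0, 0.4
sigma = (1 - theta) * s1 + theta * s2

# Interpolation inequality: an instance of Holder's inequality applied
# to the weights (1 + k^2)^s.
assert h_norm(sigma) <= h_norm(s1) ** (1 - theta) * h_norm(s2) ** theta + 1e-12
```

The same mechanism converts the uniform $H^{s_0+1}$ bound plus weak convergence into strong convergence in every lower-index space.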
Since $\mu := \delta/\varepsilon$ tends to zero in that limit, the projection $P(\delta/\varepsilon)$ converges in the strong operator topology to $P$ in that limit, since the Fourier transform of the former is uniformly bounded and converges pointwise to the Fourier transform of the latter, so for any $f \in L^2$,
$$\|[P(\mu) - P]f\|_{L^2}^2 = \int \bigl|[P(\mu)(k) - P(k)]\hat f(k)\bigr|^2\,dk$$
(or that expression with the integral replaced by a sum if the spatial domain is periodic) tends to zero by (4.10), the Dominated Convergence Theorem, and the fact that orthogonal projection operators do not increase vector length. Hence the entire right side of (4.17) converges in the above limit to $P\bigl[A_0(0)U_t - \sum_j A_j(U)U_{x_j} - F(t,x,U)\bigr]$. This implies that the left side of (4.17) also converges.
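The dominated-convergence step can be mimicked with a scalar family of Fourier multipliers: uniformly bounded symbols converging pointwise force $L^2$ convergence against any fixed $\hat f$. The multiplier below is a hypothetical stand-in for $P(\mu)(k) - P(k)$, chosen only to illustrate the mechanism.

```python
import numpy as np

k = np.fft.fftfreq(1024, d=1.0 / 1024)        # integer frequencies -512..511
f_hat = 1.0 / (1.0 + k**2)                    # a fixed profile in L^2

# Pointwise limit of the multipliers: 0 at k = 0, 1 elsewhere.
m_lim = np.where(k == 0, 0.0, 1.0)

def l2_error(mu):
    m_mu = k**2 / (k**2 + mu**2)              # |m_mu| <= 1 uniformly in mu
    return np.sqrt(np.sum(np.abs((m_mu - m_lim) * f_hat) ** 2))

errors = [l2_error(mu) for mu in (1.0, 0.1, 0.01)]
# Uniform boundedness plus pointwise convergence drive the L^2 error to zero.
assert errors[0] > errors[1] > errors[2] and errors[2] < 1e-3
```

No rate is needed: the uniform bound supplies the dominating function, exactly as in the proof.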
When (1.6) holds, that relation plus the definition $p = s + 1$ from Definition 4.4 imply that $\mu^p/\delta = C^{p-1}(1 + o(1))$. Hence Lemma 4.1 shows that (4.18) holds. Although the Fourier transform of $T_{\lim}$ may be unbounded as a function of the Fourier transform variable, (4.18) together with the convergence of $u$ to $U$ shows that the Fourier transform of the left side of (4.17) converges pointwise to the Fourier transform of $T_{\lim}U$. The fact that that left side is known to converge in $L^2$ implies that its Fourier transform also converges in $L^2$. Since the pointwise and $L^2$ limits of a sequence of functions must coincide when both exist, the Fourier transform of the left side of (4.17) tends in $L^2$ to the Fourier transform of $T_{\lim}U$, and hence that left side tends to $T_{\lim}U$. The reduction process also shows that the Fourier transform of $T_{\lim}$ lies in the image of $P(k)$ for each $k$, so rearranging the limit of (4.17) yields (4.14). When (1.7) holds instead of (1.6), that relation plus the definition $p = s + 2$ from Definition 4.4 imply that $\mu^p/\delta = o(1)$, so (4.18) holds with $C$ replaced by zero, which again leads to (4.14) but with $T_{\lim} = 0$.
Since the final expression in (4.20) looks like the $L^2$ estimate for a symmetric hyperbolic system, we obtain (4.21) for some $K_1$ and $K_2$ depending on the $H^{s_0+1}$ norms of $U^{(1)}$ and $U^{(2)}$ and the constant $c_0$ from (3.3). Estimate (4.21) plus the initial condition (4.16) imply that $\tfrac12\langle U, A_0(0)U\rangle \le \tfrac12\langle U(0), A_0(0)U(0)\rangle e^{kt} = 0$, which implies that $U \equiv 0$, that is, $U^{(1)} \equiv U^{(2)}$, yielding uniqueness. As usual, the uniqueness of the limit implies that convergence holds as $\delta$ and $\varepsilon$ tend to zero while satisfying (1.6) or (1.7), without restricting to a subsequence.

The relationship $\delta = \varepsilon^2$ does not satisfy (1.3) in dimension two. Nevertheless, as noted in the introduction, the fact that the coefficient matrix of the time derivatives does not depend on $u$ or $v$ implies that solutions of (4.22) satisfy uniform bounds. Let $f(x,y)$ be a function whose gradient belongs to $H^3$, and take the initial data to be $u(0,x,y) = u_0(x,y) := -\varepsilon f_y$ and $v(0,x,y) = v_0(x,y) := f_x$. Then $u_t(0,x,y) = 0$ and $v_t(0,x,y) = f_{yy}$, that is, the initial time derivative is bounded. Since the PDE is linear with constant coefficients, it is convenient to express the limit equation in Fourier space. By part 1 of Example 4.3, when $k \ne 0$ the limit is $U(t,k,\ell) = 0$, $V_t - i\frac{\ell^2}{k}V = 0$; for $k = 0$ but $\ell \ne 0$ the limit is $U(t,0,\ell) = 0 = V(t,0,\ell)$; and for $k = 0 = \ell$ the limit is $U_t(t,0,0) = 0 = V_t(t,0,0)$. The initial data for the limit are $U(0,k,\ell) = 0$ and $V(0,k,\ell) = ik\hat f(k,\ell)$. When $k$ and $\ell$ are both nonzero the solution of the limit equation is
$$U(t,k,\ell) = 0 \quad\text{and}\quad V(t,k,\ell) = ik\, e^{i\frac{\ell^2}{k}t}\hat f(k,\ell), \qquad (4.23)$$
while when $k = 0$ the limit is $U = 0 = V$. When the spatial domain is $\mathbb{R}^2$ then $T_{\lim}(k,\ell) = -i\frac{\ell^2}{k}$ is unbounded, but when the domain is $\mathbb{T}^2$ then it is bounded, since $|k| \ge c$ on the set where it is nonzero. Even when the spatial domain is $\mathbb{R}^2$, the fact that $V(t,k,\ell)$ contains a factor of $k$ ensures that $V_t$ is bounded, but $V_{tt}$ will be unbounded if $\hat f(0,0) \ne 0$.
1. The limit solution (4.23), which implies the limit equation satisfied by $V$, can be verified by solving the equations for $U$ and $V$ exactly for $k \ne 0$. This yields (4.23) plus $O(\varepsilon)$ corrections, whose limit as $\varepsilon \to 0$ indeed yields (4.23).

2. Adding the term $-\alpha\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)\binom{u}{v}_y$ to (4.22) changes the limit equation for $V$ to $V_t - i\frac{\ell^2}{k - \alpha\ell}V = 0$. If $\alpha$ is irrational but well approximated by rationals, then $T_{\lim}(k,\ell) = -i\frac{\ell^2}{k - \alpha\ell}$ may not be bounded by $(|k|+|\ell|)^3$ as that expression tends to infinity, even in the periodic case, so $T_{\lim}$ may not be a bounded operator from $H^3$ to $L^2$.

3. Consider the PDE $u_t + u_x + \frac1\delta L u + \frac1\varepsilon M u = 0$, where $\delta = \varepsilon^{3/2}$ and $L$ and $M$ are the matrices discussed in Part 2 of Example 4.3. Since this choice of the relationship between $\delta$ and $\varepsilon$ makes $s$ in (1.6) equal two, and hence $p$ in Definition 4.4 equal three, the projection $P$ is orthogonal to the nonzero eigenspace of $T^{(2,2)}$ as well as those of $T^{(0,0)}$ and $T^{(1,1)}$. The formula for $T^{(2,2)}$ in Part 2 of Example 4.3 therefore shows that when $m \ne 0$ and $ad - bc \ne 0$ the limit equation is simply $U = 0$, while when $m \ne 0$ but $ad - bc = 0$ the limit equation is that the first three components of $U$ vanish and $\partial_t + \partial_x$ applied to its last two components equals zero. This shows that even the number of nonzero components of the limit cannot be determined simply by looking at the number of components that do not contain large terms, nor even by first eliminating all components having terms of order $\frac1\delta$ and then eliminating those remaining components having terms of order $\frac1\varepsilon$ not coming from components already eliminated, which works for the system (4.22).
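The small-denominator phenomenon in part 2 can be seen numerically. For a badly approximable $\alpha$ such as the golden mean, $|k - \alpha\ell| \gtrsim 1/\ell$, so $\ell^2/|k - \alpha\ell|$ stays below the $(|k|+|\ell|)^3$ bound, whereas a number with very good rational approximations lets the denominators collapse at its convergents. The truncated Liouville-type value below is an illustrative stand-in for such an $\alpha$, and the threshold 1 is just a convenient normalization.

```python
import math

def worst_ratio(alpha, N):
    """Largest value of (l^2 / |k - alpha*l|) / (|k| + l)^3 over 1 <= l <= N,
    taking k to be the nearest integer to alpha*l (the worst k for each l)."""
    worst = 0.0
    for l in range(1, N + 1):
        k = round(alpha * l)
        if k == 0:
            continue
        denom = abs(k - alpha * l)
        if denom > 0:
            worst = max(worst, (l**2 / denom) / (abs(k) + l) ** 3)
    return worst

golden = (1 + math.sqrt(5)) / 2        # badly approximable
liouville_like = 0.1 + 0.01 + 1e-6     # truncation of a Liouville-type number

# Golden mean: the ratio stays below 1 for all modes tested.
assert worst_ratio(golden, 200) < 1.0
# Liouville-type alpha: the mode l = 100, k = 11 already gives
# |k - alpha*l| = 1e-4, violating the (|k| + |l|)^3 bound.
assert worst_ratio(liouville_like, 200) > 1.0
```

This is the mechanism by which $T_{\lim}$ can fail to map $H^3$ into $L^2$ even on the torus.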