Operators Arising as Second Variation of Optimal Control Problems and Their Spectral Asymptotics

We compute the asymptotics of the eigenvalues of a particular class of compact operators deeply linked with the second variation of optimal control problems. We characterize this family in terms of a set of finite dimensional data, and we apply these results to a particular class of singular extremals to obtain a precise description of the spectrum of the second variation.


Introduction
The main focus of this paper is the study of a particular class of compact operators K on the Hilbert space L^2([0, 1], R^k) with the standard Hilbert structure. They are characterized by the following properties:
• there exists a finite dimensional subspace of L^2([0, 1], R^k), which we call V, on which K becomes a self-adjoint operator, i.e.:
• K is a Hilbert–Schmidt operator with an integral kernel of a particular form, namely:
where V(t, τ) is a matrix whose entries are L^2 functions. We call the class of operators satisfying this last condition Volterra-type operators.
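As an illustrative one-dimensional example (not taken from the paper), consider the Volterra operator (Kv)(t) = ∫_0^t v(τ) dτ on L^2([0, 1], R). Its skew-symmetric part has kernel A(t, τ) = (1/2) sgn(t − τ), whose eigenvalues are ±i/((2r+1)π), so their moduli decay like 1/n. A numerical sketch (grid size and tolerances are arbitrary choices):

```python
import numpy as np

# Skew part of the simplest Volterra-type operator: kernel (1/2) sgn(t - tau).
# Its eigenvalues are +- i/((2r+1) pi); the largest modulus is 1/pi.
N = 2000
h = 1.0 / N
t = (np.arange(N) + 0.5) * h                    # midpoint quadrature nodes
A = 0.5 * np.sign(t[:, None] - t[None, :]) * h  # discretized integral kernel
mu = np.sort(np.abs(np.linalg.eigvals(A).imag))[::-1]
print(mu[0], 1 / np.pi)                         # the two should be close
```

The discrete eigenvalues appear in conjugate pairs, mirroring the two-sided spectrum of a skew-symmetric operator.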
The main results of this paper are a fairly general study of the asymptotic distribution of the eigenvalues of K when restricted to any subspace V which satisfies eq. (1) (Theorem 1), and a characterization result for the operators satisfying the two properties stated above (Theorem 2).
The first result is proved in Section 2. We first restrict ourselves to operators K of the form:
Here Z_t is a 2n × k matrix, analytic in t, and σ is the standard symplectic form on R^{2n} (see Remark 1). A similar asymptotic formula was proved in [3, Theorem 1], where it was shown that, if we consider {λ_n(K)}_{n∈Z} the decreasing (resp. increasing) arrangement of positive (resp. negative) eigenvalues of K, we have either:
for n ∈ Z sufficiently large and for some ξ > 0. The number ξ is called capacity and depends only on the matrix Z_t in the definition of K.
If ξ = 0, we go further with the expansion in eq. (4). We single out the term giving the principal contribution to the asymptotics, representing the quadratic form associated to K as:
The result mentioned above corresponds to the case Q_1 = 0; in Theorem 1 we give the asymptotics for the general case.
From the point of view of geometric control theory, Theorem 1 can be seen as an asymptotic analysis of the spectrum of the second variation for particular classes of singular extremals, and as a quantitative version of some necessary optimality conditions. Precise definitions will be given in Section 4; standard references on the second variation are [7, Chapter 20] and [1]. For now it is enough to know that the second variation Q of an optimal control problem on a manifold M is a linear operator on L^2([0, 1], R^k) of the following form:
where H_t is a symmetric k × k matrix, σ is the standard symplectic form on T_η T*M, and Z_t : R^k → T_η(T*M) is a linear map with values in the tangent space at a fixed point η ∈ T*M. For totally singular extremals, the matrix H_t appearing in eq. (5) is identically zero and the second variation reduces to an operator of the same form as in eq. (3).
In Section 3 we prove Theorem 2. We first show that any K satisfying eqs. (1) and (2) is completely determined by its (finite rank) skew-symmetric part A and can always be represented as in eq. (3). Then we relate the capacity of K to the spectrum of A.
In Section 4 we recall some basic notions from control theory, reformulate Theorem 2 in a more control-theoretic fashion, and use it to characterize the operators coming from the second variation of an optimal control problem. Moreover, we give a geometric interpretation of the capacity ξ appearing in eq. (4) in terms of the Hessian of the maximized Hamiltonian coming from the Pontryagin Maximum Principle.

Overview of the main results
We begin this section by recalling some general facts about the spectrum of compact operators; then we fix some notation and give a precise statement of the main results. Given a compact self-adjoint operator K on a Hilbert space H, we can define a quadratic form by setting Q(v) = ⟨v, K(v)⟩. The eigenvalues of Q are by definition those of K, and we will denote by Σ^±(Q) the positive and negative parts of the spectrum of Q.
By the standard spectral theory of compact operators (see [12]), the non-zero eigenvalues of K are either finite in number or accumulate at zero, and their multiplicities are finite. Consider the positive part of the spectrum of Q, Σ^+(Q), and λ ∈ Σ^+(Q). Denote by m_λ the multiplicity of the eigenvalue λ. We can introduce a monotone non-increasing sequence {λ_n}_{n∈N} indexing the eigenvalues of K, requiring that the cardinality of the set {n : λ_n = λ} be m_λ for every λ ∈ Σ^+(Q).
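The monotone arrangement can be illustrated with a toy finite-dimensional stand-in for K (a matrix chosen purely for illustration): each positive eigenvalue is repeated according to its multiplicity.

```python
import numpy as np

# Monotone (non-increasing) arrangement of the positive part of the spectrum.
# A small symmetric matrix plays the role of the compact operator K.
K = np.diag([3.0, 3.0, 1.0, -2.0, 0.5])
eigs = np.linalg.eigvalsh(K)
positive = sorted([x for x in eigs if x > 0], reverse=True)
print(positive)   # 3.0 appears twice: its multiplicity m_lambda is 2
```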
This will be called the monotone arrangement of Σ^+(Q). We can perform the same construction indexing by −n, n ∈ N, the negative part of the spectrum Σ^−(Q). This time we require that the sequence {λ_{−n}}_{n∈N} be non-decreasing. Provided that Σ^±(Q) are both infinite, we obtain a sequence {λ_n}_{n∈Z}.
Definition 1. Let Q be a quadratic form on a Hilbert space H and let j ∈ N.
• If j is odd, Q has j−capacity ξ > 0 with remainder of order ν > 0 if Σ^+(Q) and Σ^−(Q) are both infinite and:
• If j is even, Q has j−capacity (ξ_+, ξ_−) with remainder of order ν > 0 if both Σ^+(Q) and Σ^−(Q) are infinite and:
where ξ_± ≥ 0, or if at least one between Σ^+(Q) and Σ^−(Q) is infinite and the relative monotone arrangement satisfies the corresponding asymptotic relation;
• if the spectrum is finite, or λ_n = O(n^{−ν}) as n → ±∞ for any ν > 0, we say that Q has ∞−capacity.
The behaviour of the sequence {λ_n}_{n∈Z} is closely related to the following counting functions:
The requirement of Definition 1 for the j−capacity can be translated into the following asymptotics for the functions C^±_j(n):
We illustrate here some of the properties of the j−capacity. The proofs are given in Section 2, Proposition 3. Without loss of generality we state the properties for the positive part of the spectrum; analogous results hold for the negative one.
• (Homogeneity) If Q_1 and Q_2 are quadratic forms on two Hilbert spaces H_1 and H_2 of j−capacity ξ_1 and ξ_2 respectively, with the same remainder ν, then aQ_1 has j−capacity aξ_1 and the sum
• (Independence of restriction) If V ⊆ H is a subspace of finite codimension, then Q has j−capacity ξ with remainder ν if and only if its restriction to V has j−capacity ξ with remainder ν.
• (Additivity) If Q_1 has j−capacity ξ with remainder ν and Q_2 has j−capacity 0 with remainder of the same order ν, then their sum Q_1 + Q_2 has the same capacity with remainder ν′ = (j+ν)(j+1)/(j+ν+1).
In the remaining part of this section we will be dealing with quadratic forms Q coming from operators of the form given in eq. (3). Suppose that Z_t is a 2n × k matrix which depends piecewise analytically on the parameter t ∈ [0, 1] and define the following 2n × 2n skew-symmetric matrix:
As Q consider the following quadratic form on L^2([0, 1], R^k):
Remark 1. The operator K and the bilinear form Q(u, v) = ⟨u, K(v)⟩ are not symmetric. However, the operator:
satisfies eq. (1) and becomes symmetric on a finite codimension subspace V. It is enough to require that the integral ∫_0^1 Z_t v(t) dt lie in a Lagrangian subspace of (R^{2n}, σ) for any v ∈ V; for instance, we can consider the fibre (or vertical subspace), i.e. the following:
Here σ denotes the standard symplectic form on R^{2n} defined as σ(x, x′) = ⟨Jx, x′⟩.
Let f be a smooth function on [0, 1] and let k ∈ N; denote by f^{(k)} its k−th derivative with respect to t. For j ≥ 1 define the following matrix valued functions:
We use ρ_t to denote any eigenvalue of the matrix A_j(t). If j = 2k, define:
For odd indices, A_{2k−1} is skew-symmetric and thus its spectrum is purely imaginary, so we define the function:
We are now ready to state the first main result of the section.
Theorem 1. Let Q be the quadratic form in eq. (7). Q has either ∞−capacity or j−capacity with remainder of order ν = 1/2. More precisely, let j ≥ 1 be the lowest integer such that A_j(t) is not identically zero; then
• if j = 2k − 1, the (2k − 1)−capacity ξ is given by:
and thus for n ∈ Z sufficiently large:
• if j = 2k, the 2k−capacity (ξ_+, ξ_−) is given by:
and thus for n ∈ Z sufficiently large:
Remark 2. It is worth remarking that in Theorem 1 of [3] the order of the remainder for the 1−capacity was slightly better: 2/3 instead of 1/2.
The proof of this result is given in Section 2. The next theorem gives a characterization of the operators satisfying eqs. (1) and (2) and a geometric interpretation of the 1−capacity. Before stating it, let us introduce the following notation. Let A denote the skew-symmetric part of K:
Let Σ be the spectrum of A and Im(A) its image.
Theorem 2. Let K be an operator satisfying eq. (1) and eq. (2). Then A has finite rank and completely determines K. More precisely, if A has rank 2m and is represented as:
for a skew-symmetric 2m × 2m matrix A_0 and a 2m × k matrix Z_t, then:
Let Σ be the spectrum of A; if the matrix Z_t can be chosen to be piecewise analytic, the 1−capacity of K can be bounded by:

Proof of Theorem 1

Before going to the proof of Theorem 1 we still need some auxiliary results.
We start with Lemma 1, which singles out the main contributions to the asymptotics of the eigenvalues of Q (the quadratic form defined in eq. (7)). The first non-zero term of the decomposition we give will determine the rate of decay of the eigenvalues (see Proposition 4).
Before showing this and proving the precise estimates, we need to carry out the explicit computation of the asymptotics in some model cases, namely when the matrices A_j are constant. Then we have to show how the j−capacity behaves with respect to natural operations such as direct sums of quadratic forms or restriction to finite codimension subspaces (Proposition 3).
Let us start with some notation. Suppose that the map t → Z_t is real analytic (or at least regular enough to perform the necessary derivatives) and integrate by parts twice:
If we impose the condition ∫_0^1 v(t) dt = 0 (⟺ v_1(1) = 0), the term in brackets vanishes:
and we can write Q as a sum of three terms
By analogy we can make the following definitions:
Here the matrices A_j(t) are exactly those defined in eq. (9).
Lemma 1. For every j ∈ N, on the subspace V_j, the form Q can be represented as
The matrices A_{2k}(t) are symmetric provided that (d/dt)A_{2k−1}(t) ≡ 0. On the other hand, A_{2k−1} is always skew-symmetric.
Proof It is sufficient to notice that R_1(v) has the same form as Q(v), but with v_1 instead of v and Ż_t instead of Z_t. Thus the same scheme of integration by parts gives the decomposition.
Notice that A_{2k}(t) = A^*_{2k}(t) + (d/dt)A_{2k−1}(t), thus the skew-symmetric part of A_{2k}(t) is zero if A_{2k−1} is zero or constant. A_{2k−1}(t) is always skew-symmetric by definition.
Now we would like to compute explicitly the spectrum of the Q_j when the matrices A_j are constant. Unfortunately, describing the spectrum with the boundary conditions given by the V_j is quite hard: already for Q_4 the equation determining it cannot be solved explicitly.
We will derive the Euler–Lagrange equations for the Q_j and turn instead to periodic boundary conditions, for which everything becomes very explicit, and show how to relate the solutions of the two boundary value problems we are considering. Let us write down the Euler–Lagrange equations for the forms Q_j. If j = 2k, integration by parts yields:
Notice that the boundary terms vanish identically if we impose the vanishing of v_j for 1 ≤ j ≤ k at the boundary points.
We change notation and define w(t) = v_{2k}(t) and w^{(j)}(t) = (d^j/dt^j) w(t). The new equations are:
We can perform a linear change of coordinates that diagonalizes A_{2k} to reduce to m one-dimensional systems. Imposing periodic boundary conditions, we are thus left with the following boundary value problem:
The case of odd j is very similar; in fact Q_{2k−1}(v) can be rewritten as:
Here by b.t. we mean boundary terms like those appearing in the previous equation. They again disappear if we assume that v_j ∈ V_j. Thus we end up with a boundary value problem similar to the one we had before, with the difference that now the matrix A_{2k−1} is skew-symmetric.
If we split the space into the kernel and the invariant subspaces on which A_{2k−1} is non-degenerate, we can decompose Q_{2k−1} as a direct sum of two-dimensional forms. Imposing periodic boundary conditions, we end up with the following boundary value problems:
Lemma 2. The boundary value problem in eq. (12) has a solution if and only if λ ∈ {µ/(2πr)^{2k} : r ∈ N}. Moreover, any such λ has multiplicity 2. In particular, the decreasing sequence of λ for which eq. (12) has solutions satisfies:
Similarly, the boundary value problem in (13) has a solution if and only if:
and any such λ has again multiplicity 2. The monotone rearrangement of λ for which there exists a solution to the boundary value problem is:
In both cases w(t) can be expressed as a combination of trigonometric and hyperbolic functions with the appropriate frequencies.
Without loss of generality we can assume µ > 0; we have to consider two separate cases.
Case 1: k even and λ > 0, or k odd and λ < 0. In this case the quantity (−1)^k µλ^{−1} > 0. If we define a^{2k} = (−1)^k µλ^{−1} > 0 with a > 0, we have to solve:
A basis for the space of solutions of the ODE is then {e^{ω^j a t} : ω = e^{iπ/k}}. For us it will be more convenient to switch to a real representation of the space of solutions. Notice the following symmetry of the even roots of 1: if η is a root of 1 different from ±1, ±i, then {η, η̄, −η, −η̄} are still distinct roots of 1 (this is also a Hamiltonian feature of the problem).
If we write η = η_1 + iη_2, this symmetry implies that the space generated by {e^{ηt}, e^{η̄t}, e^{−ηt}, e^{−η̄t}} is the same as the space generated by
Let us rescale these functions by a (so that they solve eq. (14)) and call their linear span U_η; we then define U_1 to be the span of {sinh(t), cosh(t)} and U_i the span of {sin(t), cos(t)}. Note that U_i appears if and only if k is even.
Thus the solution space for our problem is the space ⊕_η U_η, where η ranges over
Now we have to impose the boundary conditions. Notice that, if k is even, then U_i is made of periodic functions, so its elements are always solutions. We can look for more on the complement ⊕_{η≠i} U_η. Suppose by contradiction that w is one such solution. Write w = Σ_η w_η with w_η ∈ U_η and let b = sup{ℜ(η) : η ∈ E, w_η ≠ 0}. It follows that either sinh(bat) or cosh(bat) is present in the decomposition of w. It follows that:
and so |w| is unbounded as t → +∞ (or −∞), and thus w is not periodic. It follows that there are periodic solutions only if k is even (and thus λ > 0) and a = (µ/λ)^{1/(2k)} = 2πr. Notice that we have two independent solutions, so if we order the solutions decreasingly we have: λ_r = µ/(2π⌈r/2⌉)^{2k}, r ∈ N.
Case 2: k odd and λ > 0, or k even and λ < 0. In this case we have to look at the roots of −1, but the argument is very similar. If k is even there are no solutions, since purely imaginary frequencies are lacking. If k is odd, set |µλ^{−1}| = a^{2k}; then the boundary value problem is:
The roots of −1 are just the roots of 1 rotated by i. Now the space of solutions is ⊕_{η≠1} U_η. We find again two independent solutions; if we order them we get:
Notice that positive µ gives rise to positive λ. Thus if we consider µ < 0, we get the same result but with switched signs.
We can reduce the odd case (eq. (13)) to the even one. Consider the one-dimensional equation of twice the order, i.e.:
Now, the discussion above tells us that there are exactly two independent solutions with periodic boundary conditions whenever λ satisfies (µ/|λ|)^{1/(2k−1)} = 2rπ. It follows that again there are two independent solutions, this time for both signs of λ. If we order them we get:
Proposition 1. Let µ > 0 and s ∈ (0, +∞); denote by η_s the number of solutions of eq. (12) with λ greater than s, and similarly denote by ω_s the number of solutions with λ greater than s of:
The same conclusion holds for eq. (13).
Proof The result follows from standard results about the Maslov index of a path in the Lagrange Grassmannian. References on the topic can be found in [6, 5, 2]. Let us illustrate the construction briefly. Let (Σ, σ) be a symplectic space; the Lagrange Grassmannian is the collection of Lagrangian subspaces of Σ, and it has the structure of a smooth manifold. For any Lagrangian subspace L_0 we define the train of L_0 to be the set:
T_{L_0} is a stratified set; the biggest stratum has codimension 1 and is endowed with a co-orientation. If γ is a smooth curve with values in the Lagrange Grassmannian (i.e. a smooth family of Lagrangian subspaces) which intersects T_{L_0} transversally in its smooth part, one defines an intersection number by counting the intersection points weighted with a plus or minus sign depending on the co-orientation. Tangent vectors at a point L of the Lagrange Grassmannian (which is a subspace of Σ) are naturally interpreted as quadratic forms on L. We say that a curve is monotone if at any point its velocity is either a non-negative or a non-positive quadratic form. For monotone curves, the Maslov index counts the number of intersections with the train up to sign. For generic continuous curves it is defined via a homotopy argument. Denote by Mi_{L_0}(γ) the Maslov index of a curve γ and let L_1 be another Lagrangian subspace. In [2] the following inequality is proved:
Let us apply these results to our problem. First of all, let us produce a curve in the Lagrange Grassmannian whose Maslov index coincides with the counting functions ω_s and η_s. The right candidate is the graph of the fundamental solution of w^{(2k)}(t) = (−1)^k (µ/λ) w(t). We write down a first order system on R^{2k} equivalent to our boundary value problem; if we call the coordinates on R^{2k} x_j, set:
For simplicity call this quantity a; the matrix we obtain has the following structure:
This matrix is not Hamiltonian with respect to the standard symplectic form on R^{2k}, but it is straightforward to compute a similarity transformation that
sends it to a Hamiltonian one (recall that we already used the fact that A_λ has the spectrum of a Hamiltonian matrix). Moreover, the change of coordinates can be chosen to be block diagonal, and thus it preserves the subspace B = {x_j = 0, k ≤ j}, which remains Lagrangian too. Since later on we will have to show that the curve we consider is monotone, we give this change of coordinates explicitly. Define the matrix S by setting S_{i,k−i+1} = (−1)^{i−1} and zero otherwise: it is the matrix with alternating ±1 on the anti-diagonal. Define the following 2k × 2k matrices:
Set N to be the lower triangular k × k shift matrix (i.e. the upper left block of A_λ above) and E the matrix with just a 1 in position (1, k) (i.e. the lower left block of A_λ). The new matrix of coefficients is:
Now we are ready to define our curve. First of all, the symplectic space we are going to use is (R^{4k}, σ ⊕ (−σ)), where σ is the standard symplectic form; in this way graphs of symplectic transformations are Lagrangian subspaces. Sometimes we will also denote the direct sum of the two symplectic forms with opposite signs by σ ⊖ σ. Let Φ^1_λ be the fundamental solution of Φ̇^t_λ = Â_λ Φ^t_λ at time t = 1. Consider its graph:
Once we prove that γ is monotone, it is straightforward to check that Mi_{B×B}(γ|_{[s,+∞)}) counts the number of solutions of the boundary value problem given in eq. (15) for λ ≥ s, and similarly Mi_{Γ(I)}(γ|_{[s,+∞)}) counts the solutions of eq. (12) for λ ≥ s. Here Γ(I) stands for the graph of the identity map (i.e. the diagonal subspace).
Let us check that the curve is monotone. As already mentioned, tangent vectors to the Lagrange Grassmannian can be interpreted as quadratic forms; being monotone means that the following quadratic form is either non-negative or non-positive:
We use the ODE for Φ^t_λ to prove monotonicity:
where we used the facts that ∂_λ Φ^0_λ = ∂_λ Id = 0 and that Â_λ is Hamiltonian, so that JÂ_λ = −Â^*_λ J, to cancel the first and third terms. It remains to check J ∂_λ Â_λ. It is straightforward to see that it is a diagonal matrix with just one non-zero entry, thus it is either non-negative or non-positive. So ∂_λ γ is either non-positive or non-negative, being the integral of a non-positive or non-negative quantity (the sign is independent of ξ). Now the statement follows from inequality (16).
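The model computation of Lemma 2 can be checked numerically in the simplest case k = 1: periodic solutions of w″ = −(µ/λ)w on [0, 1] exist iff µ/λ = (2πr)², i.e. λ = µ/(2πr)², each with multiplicity 2. A sketch using a circulant finite-difference Laplacian (grid size and tolerances are arbitrary choices):

```python
import numpy as np

# Periodic eigenvalues of -d^2/dt^2 on [0,1]: 0, then (2 pi r)^2 each twice.
N = 512
h = 1.0 / N
D2 = (np.diag(np.full(N, -2.0)) + np.diag(np.ones(N - 1), 1)
      + np.diag(np.ones(N - 1), -1))
D2[0, -1] = D2[-1, 0] = 1.0           # periodic wrap-around
D2 /= h ** 2
freqs = np.sort(-np.linalg.eigvalsh(D2))
print(freqs[:5])   # ~ [0, (2 pi)^2, (2 pi)^2, (4 pi)^2, (4 pi)^2]
```

The double eigenvalue reflects the two independent periodic solutions sin(2πrt) and cos(2πrt) found in the proof.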
We are finally ready to compute the asymptotics for Q_j when the matrix A_j is constant. The next proposition translates the estimate on the counting functions η_s and ω_s defined in Proposition 1 into an estimate for the eigenvalues.
Proposition 2. Let Q_j be any of the forms appearing in eq. (11).
• Suppose j = 2k, with A_{2k} symmetric and constant, and let Σ_{2k} be its spectrum. Define
Then Q_{2k} has capacity (ξ_+, ξ_−) with remainder of order one. Moreover, if
where p(r) = 0 if r is even and p(r) = 1 if r is odd. Similarly for negative r with ξ_−.
• Suppose j = 2k + 1, with A_{2k+1} skew-symmetric and constant, and let Σ_{2k+1} be its spectrum. Define
Then Q_{2k+1} has capacity ξ with remainder of order one. Moreover, if
Proof First of all we consider a one-dimensional system and we rewrite the inequality for |η_s − ω_s| as an inequality for the eigenvalues. Notice that if we have two integer-valued functions f, g : R → N and an inequality of the form:
it means that we have at least f(s) solutions bigger than s and at most g(s). This implies that the sequence of ordered eigenvalues satisfies:
Now we compute these quantities explicitly. By virtue of Proposition 1, we can take as upper/lower bounds for the counting function g(s) = η_s + 2k and f(s) = η_s − 2k.
We choose the point s = µ/(2πr)^j. It is straightforward to see that:
And thus we obtain:

Now, if we change the labelling, we find that, for
By definition λ_{2l} ≥ λ_{2l+1} ≥ λ_{2l+2}, and thus we have a bound for any index r ∈ N. Now we consider the m-dimensional system; notice that we reduced the problem, via diagonalization, to the sum of m one-dimensional systems. Thus our form Q_j is always a direct sum of one-dimensional objects. We show now how to recover the desired estimate for the sum of quadratic forms.
First of all observe that counting functions are additive with respect to direct sums. In fact, if Q = ⊕^m_{i=1} Q_i, λ is an eigenvalue of Q if and only if it is an eigenvalue of Q_i for some i. We proceed as we did before. Suppose that Q_a is one-dimensional; it is straightforward to see that the cardinality of the above set is #{r ∈ N : r ≤ c_a l} = ⌊c_a l⌋. Now we are ready to prove the estimates for the direct sum of forms. Adding everything we have:
It is clear that
Rewriting for the eigenvalues with l ≥ mk we obtain:
It is straightforward to compute the bounds in eqs. (17) and (18) observing again
Remark 3. The shift m appearing in eqs. (17) and (18) is due to the fact that we are considering the direct sum of m quadratic forms. It is worth noticing that this does not depend on the fact that we are considering a quadratic form on L^2([0, 1], R^m): the estimates in eqs. (17) and (18) hold whenever we consider the direct sum of m one-dimensional forms with constant coefficients. This consideration will be used in the proof of Theorem 1 below.
Now we prove some properties of the capacities which are closely related to the explicit estimates we have just proved for the linear case. As done so far, we state the proposition for the ordered positive eigenvalues; an analogous statement is true for the negative ones.
Proposition 3. Suppose that Q is a quadratic form on a Hilbert space and let {λ_n}_{n∈N} be its positive ordered eigenvalues. Suppose that:
1. Then for any such Q_i on a Hilbert space
as n → +∞.
3. Suppose that Q and Q̃ are two quadratic forms. Suppose that Q is as at the beginning of the proposition and Q̃ satisfies:
Then the sum
Proof The asymptotic relation can be written in terms of a counting function. Take the j−th root of the eigenvalues of Q_i; then it holds that
So summing up all the contributions we get the estimate in i).
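The additivity of counting functions under direct sums can be illustrated with model eigenvalue sequences (the sequences below are illustrative, not from the paper): if the summands have eigenvalues (c_i/n)^j, the merged non-increasing arrangement behaves like ((c_1 + c_2)/n)^j, since the counting functions ⌊c_i s^{−1/j}⌋ add.

```python
# Merge two model spectra lambda_n = (c_i/n)^j and compare the n-th term
# of the merged arrangement with the predicted ((c1 + c2)/n)^j.
j, c1, c2, N = 2, 1.0, 2.0, 4000
seq1 = [(c1 / n) ** j for n in range(1, N + 1)]
seq2 = [(c2 / n) ** j for n in range(1, N + 1)]
merged = sorted(seq1 + seq2, reverse=True)
n = 3000
predicted = ((c1 + c2) / n) ** j
print(merged[n - 1], predicted)   # the two agree closely
```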
The min-max principle implies that we can control the n−th eigenvalue of Q|_U with the n−th and (n + d)−th eigenvalues of Q, i.e.:
So, if the codimension is fixed, it is equivalent to provide an estimate for the eigenvalues of Q or for those of Q|_U.
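For matrices this is Cauchy interlacing, and it can be checked directly (a sketch with a random symmetric matrix standing in for Q and a principal submatrix standing in for the codimension-one restriction):

```python
import numpy as np

# Interlacing: lambda_{n+1}(Q) <= lambda_n(Q|_U) <= lambda_n(Q) for codim 1.
rng = np.random.default_rng(0)
M = rng.standard_normal((8, 8))
Q = (M + M.T) / 2
lam = np.sort(np.linalg.eigvalsh(Q))[::-1]            # eigenvalues of Q
lam_U = np.sort(np.linalg.eigvalsh(Q[1:, 1:]))[::-1]  # restriction, codim 1
ok = all(lam[n + 1] <= lam_U[n] <= lam[n] for n in range(7))
print(ok)
```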
For the last point we use Weyl's inequality: we can estimate the (i + j)−th eigenvalue of a sum of quadratic forms with the sum of the i−th and the j−th eigenvalues of the summands. Write, as in [3], Q′ as Q + Q̃ and Q as Q′ + (−Q̃), and choose i = n − ⌊n^δ⌋ and j = ⌊n^δ⌋ in the first case and i = n and j = ⌊n^δ⌋ in the second. This implies:
The best remainder is computed as ν′ = max_{δ∈(0,1)} min{(j + µ)δ, j + 1 − δ, j + ν}.
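In the case µ = ν, the first two terms in this maximin cross at δ = (j+1)/(j+ν+1), which yields the closed form ν′ = (j+ν)(j+1)/(j+ν+1) stated in the additivity property; a quick numerical check (parameter values chosen arbitrarily):

```python
# Grid search over delta for max_delta min{(j+nu) delta, j+1-delta, j+nu},
# compared against the closed form (j+nu)(j+1)/(j+nu+1).
j, nu = 2, 0.5
deltas = [i / 10 ** 5 for i in range(1, 10 ** 5)]
best = max(min((j + nu) * d, j + 1 - d, j + nu) for d in deltas)
closed = (j + nu) * (j + 1) / (j + nu + 1)
print(best, closed)
```

Note that the third term j + ν never binds for ν ≥ 0, since (j+1)/(j+ν+1) ≤ 1.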
Collecting all the facts above, we have the following estimate on the decay of the eigenvalues of Q_j, independently of any analyticity assumption on the kernel.
Proposition 4. Take Q_j as in the decomposition of Lemma 1. Then the eigenvalues of Q_j satisfy:
Moreover, for any k ∈ N and for any 0 ≤ s ≤ k, the forms Q_{2k+1} and Q_{2k} have the same first term asymptotics as the forms:
Proof Let us start with the even case, j = 2k. It holds that:
where C = max_t ∥A_t∥. By comparison with the constant coefficient case we get the bound. Suppose now that j = 2k − 1. As before, there is a constant C such that
Consider now the following quadratic forms on L^2([0, 1], R^k):
Define V_n = {v_1, …, v_n}^⊥, where the v_i are linearly independent eigenvectors of F_k associated to the first n eigenvalues λ_1 ≥ ⋯ ≥ λ_n. Similarly define U_n = {u_1, …, u_n}^⊥ to be the orthogonal complement of the eigenspace associated to the first n eigenvalues of F_{k+1}. It follows that:
We already have an estimate for the eigenvalues of F_k and F_{k+1}, since we have already dealt with the constant coefficients case. By the choice of the subspaces V_n and U_n, the maxima on the right hand side are the square roots of the n−th eigenvalues of the respective forms. Thus one gives a contribution of order n^{−k} and the other of order n^{−k−1}, and the first part of the proposition is proved.
For the second part, without loss of generality suppose that j = 2k; the other case is completely analogous.
The second term above is of higher order by the first part of the lemma, and so, iterating the integration by parts on the first term, at step s we get that:
The second term on the right hand side is again of order n^{−2k−1}; this can be checked in the same way as in the first part of the proposition. This finishes the proof.

Now we prove the main result of this section:
Proof of Theorem 1 Suppose that j = 2k is even. We work on
Since the matrix A_t is analytic, we can diagonalize it piecewise analytically in t (see [11]). Thus there exists a piecewise analytic orthogonal matrix O_t such that O^*_t A_t O_t is diagonal. By the second part of Proposition 4, if we make the change of coordinates v_t → O_t v_t, we can reduce to studying the direct sum of m one-dimensional forms. Without loss of generality we consider forms of the type:
where now a_t is piecewise analytic and v_k a scalar function.
For simplicity we can assume that a_t does not change sign and is analytic on the whole interval. If that were not the case, we could just divide [0, 1] into a finite number of intervals and study Q_{2k} separately on each of them.
Suppose we pick a point t_0 in (0, 1) and consider the following subspace of codimension mk in V_k:
On this subspace the form Q_{2k} splits as a direct sum:
Now, by Proposition 3 (points i) and ii)), we can introduce as many points as we want and work separately on each segment, and the asymptotics will not change (as long as the number of points is finite). Now we fix a partition Π. We have already analysed the spectrum for the problem with constant a_t on [0, 1]; the last step needed to understand the quantities on the right and left hand sides is to see how the eigenvalues rescale when we change the length of [0, 1].
If we look back at the proof of Lemma 2, it is straightforward to check that the length is relevant only when we impose the boundary conditions; we find that the eigenvalues are λ = a ℓ^{2k}/(2πn)^{2k}, again double. In particular, the estimates in eqs. (17) and (18) are still true replacing µ_i with a^±_i ℓ^{2k}. If we now replace ℓ by |t_{i+1} − t_i| and sum the capacities according to Proposition 3, we have the following estimate on the eigenvalues on V_Π, for n ≥ 2k|Π|:
Moreover, the min-max principle implies that, for n ≥ k|Π|:
In particular, for n ≥ 3k|Π| we have:
We now address the issue of the convergence of the Riemann sums. Set
It is well known that I^±_a → I_a as long as sup_i |t_i − t_{i+1}| goes to zero, but we need a more quantitative bound on the rate of convergence. Using results from [9], for an equispaced partition we have that:
where C(a, k, ±) is a constant that depends only on the function a and on k, and the inequality holds for |Π| ≥ n_0 sufficiently large, where n_0 depends just on a and k.
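The first-order convergence of equispaced Riemann sums invoked here can be illustrated with a model integrand (f and the grid sizes are arbitrary choices, not from the paper):

```python
# Left-endpoint Riemann sums of a smooth integrand converge at rate O(1/N):
# the error roughly halves each time the number of points doubles.
def left_sum(f, n):
    return sum(f(i / n) for i in range(n)) / n

f = lambda t: (1 + t) ** 0.5
exact = (2 / 3) * (2 ** 1.5 - 1)     # integral of sqrt(1 + t) over [0, 1]
errors = [abs(left_sum(f, n) - exact) for n in (100, 200, 400)]
print(errors)
```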
Consider the right hand side of eq. (19); adding and subtracting (I_a/(πn))^{2k}, we find that for n ≥ max{n_0, k|Π|}:
A simple algebraic manipulation shows that there are constants C_1, C_2 and C_3 such that the difference on the right hand side is bounded by the following quantity, for n ≥ max{3k|Π|, n_1|Π|, n_0}, where n_1 is a certain threshold independent of |Π|:
The idea now is to choose, for each n, a partition Π of size |Π| = ⌊n^δ⌋ to provide a good estimate of λ_n(Q). The best result in terms of approximation is obtained for δ = 1/2. Heuristically this can be explained as follows: on one hand, the first piece of the error term is of order n^{−2k−δ}; it comes from the convergence of the Riemann sums and gets better as δ → 1. On the other hand, the second term comes from the estimate on the eigenvalues and gets worse and worse as n^δ becomes comparable to n.
A perfectly analogous argument allows us to construct an error function for the left hand side of eq. (19) which decays as n^{−2k−1/2} for n sufficiently large.
We have proved so far that, for one-dimensional forms, Q_{2k} has 2k−capacity ξ_+ = (∫_0^1 a_t^{1/(2k)} dt)^{2k}. Now we apply point i) of Proposition 3 to obtain the formula in the statement for forms on L^2([0, 1], R^m). Finally, notice that by Proposition 4 the eigenvalues of R_k(v) decay as n^{−2k−1}; if we apply point iii) of Proposition 3, we find that Q_{2k}(v) + R_k(v) has the same 2k−capacity as Q_{2k}, with remainder of order 1/2.
Now we consider the case j = 2k − 1. The idea is to reduce to the case j = 4k − 2 as in the proof of Lemma 2 and use the symmetries of Q_{2k−1} to conclude. In the same spirit as at the beginning of the proof, let us diagonalize the kernel A_{2k−1}. We thus reduce everything to the two-dimensional case, i.e. to the quadratic forms:
and so the spectrum is two-sided and the asymptotics are the same for positive and negative eigenvalues. Now we reduce the problem to the even case. Let us consider the square of Q_{2k−1}: by Proposition 4, Q_{2k−1} has the same asymptotics as the form:
So we have to study the eigenvalues of the symmetric part of F. It is clear that:
Thus we have to deal with the quadratic form:
The last term is the easiest to write; it is just:
which is precisely of the form of point i) and gives 1/4 of the desired asymptotics. The operator F^* acts as follows:
Using integration by parts one can single out the term A_t v_{2k−1}. To illustrate the procedure, for k = 1 one gets:
The other terms thus do not affect the asymptotics, since by Proposition 4 they decay at least as O(n^{−3}). The proof goes along the same lines for general k.
The same reasoning applies to the term ⟨F(v), F^*(v)⟩. Summing everything, one gets that the leading term is
and so this is precisely the same case as point i). Recall that A_t is a 2 × 2 skew-symmetric matrix as defined in eq. (20), thus the eigenvalues of its square coincide and are equal to a_t^2. It follows that, for n sufficiently large, the squares of the eigenvalues of Q satisfy:
It is immediate to see that:
This mirrors the fact that the spectrum of Q_{2k−1} is double and any pair λ, −λ is sent to the same eigenvalue λ^2. Thus:
Moreover, given two sequences {a_n}_{n∈N} and {b_n}_{n∈N} with b_n = o(a_n), we have a_n + b_n = a_n(1 + b_n/a_n), so the remainder is still 2k − 1 + 1/2. Arguing again by point i) of Proposition 3, one gets the estimate in the statement.
The last part, about the ∞−capacity, follows directly from Proposition 4: if A_j ≡ 0 for every j, then for any ν > 0 we have λ_n n^ν → 0 as n → ±∞.

Proof of Theorem 2
The proof of the first part of the statement rests on two elementary observations. In the sequel we will use the shorthand notation A for Skew(K).

Fact 1: Equation (1) holds if and only if A has finite rank
Suppose that K|_V is symmetric. Consider the orthogonal splitting of L^2([0,1], R^k) as V ⊕ V^⊥. Equation (1) can be reformulated as: Conversely, if the range of A is finite dimensional, we can decompose L^2([0,1], R^k) as Im(A) ⊕ ker(A), where the decomposition is orthogonal by skew-symmetry. Thus, on ker(A), K is symmetric.
Fact 2: A determines the kernel of K
It is well known that, if K is Hilbert-Schmidt, then K* is Hilbert-Schmidt too. Since we are assuming eq. (2), its kernel is given by: So we can write down the integral kernel A(t, τ) of A as follows: The key observation now is that the support of the kernel of K is disjoint from the support of the kernel of K*. Thus the kernel of A determines the kernel of K (and vice versa). Now, since we are assuming that A has finite dimensional image, we can present its kernel as: where A_0 is a skew-symmetric matrix and Z_t is a dim(Im(A)) × k matrix whose rows are the elements of some orthonormal basis of Im(A). Without loss of generality we can assume A_0 = J: with an orthogonal change of coordinates A_0 decomposes as a direct sum of rotations with amplitudes λ_i, and rescaling the coordinates by √λ_i yields the desired canonical form J.
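Facts 1 and 2 can be checked numerically on a discretization. The sketch below is an illustration, not part of the proof; the specific choice Z_t = (X_t, Y_t) = (1, t) and the grid size are our own. It builds the Volterra-type kernel Z_t^T J Z_τ truncated to τ < t, verifies that the skew-symmetric part A = (K − K*)/2 has rank at most dim(Im(A)) = 2, and that K is symmetric on V = {v : ∫_0^1 X_t v_t dt = 0}, as in eq. (1).

```python
import numpy as np

# Discretize [0, 1]; Z_t = (X_t, Y_t) with X_t = 1, Y_t = t (illustrative choice).
N = 400
t = (np.arange(N) + 0.5) / N
dt = 1.0 / N
Z = np.vstack([np.ones(N), t])           # 2 x N, rows X_t and Y_t
J = np.array([[0.0, 1.0], [-1.0, 0.0]])  # standard 2x2 symplectic matrix

# Volterra-type operator: (Kv)(t) = int_0^t Z_t^T J Z_tau v(tau) dtau.
V = Z.T @ J @ Z                          # V[i, j] = Z_{t_i}^T J Z_{t_j}
K = np.tril(V, k=-1) * dt                # keep only tau < t

# Fact 1 / Fact 2: A = Skew(K) carries the untruncated kernel (1/2) Z_t^T J Z_tau,
# hence its rank is at most 2 = dim Im(A).
A = 0.5 * (K - K.T)
rank_A = np.linalg.matrix_rank(A, tol=1e-8)

# Eq. (1): K is symmetric on V = {v : int_0^1 X_t v_t dt = 0}.
x = Z[0]                                 # the row X_t
rng = np.random.default_rng(0)
v, w = rng.standard_normal((2, N))
v -= (v @ x) / (x @ x) * x               # project onto {v : sum_i X_{t_i} v_i = 0}
w -= (w @ x) / (x @ x) * x
asym = abs(w @ (K @ v) * dt - v @ (K @ w) * dt)

print(rank_A)          # 2
print(asym < 1e-10)    # True
```

The asymmetry defect ⟨Kv, w⟩ − ⟨v, Kw⟩ equals σ(Zw, Zv) with Zv = ∫ Z_τ v_τ dτ, which vanishes because both Zv and Zw land in the (Lagrangian) vertical subspace.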
The first part of the statement is proved, so we pass to the second one. First of all notice that, now that we have written any operator satisfying eqs. (1) and (2) in the same form as those in eq. (3), we can apply all the results about the asymptotics of their eigenvalues. In particular, if we assume that the space Im(A) ⊂ L^2([0,1], R^k) is generated by piecewise analytic functions, the ordered sequence of eigenvalues satisfies: Notice that we are using a better estimate on the remainder (for the case of the 1-capacity) than the one given in Theorem 1, which was proved in [3]. We denote by M† = M* the conjugate transpose. Set 2m = dim(Im(A)); since the map t → Z_t is analytic, there exists a piecewise analytic family of unitary matrices G_t such that: Without loss of generality we can assume that the functions ζ_i are analytic on the whole interval and everywhere nonnegative. Recall that the coefficient ξ appearing in the asymptotic was computed as ξ = Multiplication by G_t is an isometry, thus the eigenvalues of Skew(K) = A remain the same if we consider the similar operator G^{-1} ∘ A ∘ G, which acts as follows: To simplify notation let us forget about this change of coordinates and still call Z_t the matrix Z_t G_t. Write Z_t as: We introduce the following notation: for a vector function v_i, the quantity (v_i)_j stands for the j-th component of v_i.
We can now bound the function ζ(t) in terms of the entries of the matrix Z_t: where |v| denotes the vector whose entries are the absolute values of the entries of v. Integrating and using the Hölder inequality for the L^2 norm, we get: The next step is to relate the quantity on the right hand side to the eigenvalues of A. The strategy is to modify the matrix Z_t in order to get an orthonormal frame of Im(A). Keeping track of the transformations used, we obtain a matrix representing A; it is then enough to compute the eigenvalues of said matrix.
We can assume, without loss of generality, that ⟨x_i, x_j⟩_{L^2} = δ_{ij}. This can be achieved with a symplectic change of the matrix Z_t. Then we modify the y_j in order to make them orthogonal to the space generated by the x_j. We use the following transformation: where M is defined by the relation The last step is to make the y_j orthonormal. If we multiply Y_t by a matrix L, we find an equation for L. Thus the matrix representing A in these coordinates is one half of: If we square A_0 and compute the trace, we get: Call Σ(A) the spectrum of A; since A is skew-symmetric it follows that: Recalling that ||x_i|| = 1 and putting everything together, we find that:

Example 1. Consider a matrix Z_t of the following form: The capacity of K is given by ζ = ∫_0^1 |ξ_1 ξ_2|(t) dt. We can assume that ⟨ξ_2, ξ_3⟩ = 0 and ||ξ_2|| = 1. A direct computation shows that the eigenvalues of Skew(K) are ± i/2 . This shows that the two quantities behave in very different ways. If we choose ξ_2 very close to ξ_1 and ξ_3 small, capacity and eigenvalues are comparable. If we choose ξ_3 very large, the capacity remains the same whereas the eigenvalues explode. In particular there cannot be any lower bound on ζ in terms of the eigenvalues of K.

Remark 4. There is a natural class of transformations that preserves the capacity. Take any path Φ_t of symplectic matrices (say L^2 integrable): the operators constructed with Z_t and Φ_t Z_t have the same capacity (but the respective skew-symmetric parts clearly need not have the same eigenvalues). Take for instance the example above and suppose for simplicity that ξ_1 and ξ_2 are positive and never vanishing. Using the following transformation: we obtain: and in this case the eigenvalues become ±(i/2)⟨ξ_1, ξ_2⟩, precisely half the capacity.

The second variation of an optimal control problem
We start this section by collecting some basic facts about optimal control problems and their first and second variations. Standard references on the topic are [3], [7], [4], [10] and [8].

Symplectic geometry and optimal control problems
Consider a smooth manifold M; its cotangent bundle T*M is a vector bundle over M whose fibre at a point q is the vector space of linear functions on T_q M, the tangent space of M at q. Let π be the natural projection π : T*M → M, which takes a covector and gives back its base point. Using the projection map we define the following 1-form, called the tautological (or Liouville) form: for an element X ∈ T_λ(T*M), s_λ(X) = λ(π_* X). One can check in local coordinates that σ = ds is non-degenerate. Considering (T*M, σ) we thus obtain a symplectic manifold.
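In canonical local coordinates (p_1, …, p_n, q_1, …, q_n) on T*M these objects take the familiar form (a standard computation, recalled here for convenience):

```latex
s = \sum_{i=1}^{n} p_i \, dq_i,
\qquad
\sigma = ds = \sum_{i=1}^{n} dp_i \wedge dq_i .
```

The non-degeneracy of σ is immediate from this coordinate expression.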
Using the symplectic form we can associate a vector field to any function on T*M. Suppose that H is a smooth function on T*M; we define the vector field H⃗ by setting: H is called the Hamiltonian function and H⃗ the Hamiltonian vector field.
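The defining relation is standard; with the convention σ = ds = Σ dp_i ∧ dq_i used above, one common way to write it and its coordinate expression is the following (sign conventions differ between references, so the paper's elided display may carry the opposite sign):

```latex
\sigma(\,\cdot\,, \vec{H}) = dH
\quad\Longleftrightarrow\quad
\dot{q}_i = \frac{\partial H}{\partial p_i},
\qquad
\dot{p}_i = -\frac{\partial H}{\partial q_i}.
```

These are the classical Hamilton equations for the flow of H⃗.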
On T*M we have a particular instance of this construction, which can be used to lift arbitrary flows on the base manifold M to Hamiltonian flows on T*M. For any vector field V on M consider the following function:

It is straightforward to check in local coordinates that the projection π_* sends the Hamiltonian lift back to the original vector field V.
The next objects we are going to introduce are Lagrangian subspaces. We say that a subspace W of a symplectic vector space (Σ, σ) is Lagrangian if W coincides with its symplectic orthogonal, i.e. if {v ∈ Σ : σ(v, w) = 0, ∀ w ∈ W} = W; in particular, the restriction of σ to W vanishes. An example of a Lagrangian subspace is the tangent space to a fibre, i.e. the kernel of π_*. More generally we can consider the following submanifolds of T*M: where N ⊂ M is a submanifold. A(N) is called the annihilator of N and its tangent space at any point is a Lagrangian subspace.
Suppose we are given a family of complete and smooth vector fields f_u, depending on a parameter u ∈ U ⊂ R^k, and a Lagrangian, i.e. a smooth function ϕ(u, q) on U × M. We use the vector fields f_u to produce a family of curves on M. For any function u ∈ L^∞([0,1], U) we consider the following non-autonomous ODE system on M: The solutions are always Lipschitz curves. For fixed q_0, the set of functions u ∈ L^∞([0,1], U) for which said curves are defined up to time 1 is an open set, which we call U_{q_0}. We can let the base point q_0 vary and consider U = ∪_{q_0∈M} U_{q_0}. It turns out that this set has the structure of a Banach manifold (see [6]). We call the L^∞ functions obtained this way admissible controls and the corresponding trajectories on M admissible curves.
Denote by γ_u the admissible curve obtained from an admissible control u. We are interested in the following minimization problem on the space of admissible controls: We often reduce the space of admissible variations by imposing additional constraints on the initial and final position of the trajectory. For example one can consider trajectories that start and end at two fixed points q_0, q_1 ∈ M, or trajectories that start from a submanifold N_0 and reach a second submanifold N_1. More generally we can ask that the curves satisfy (γ(0), γ(1)) ∈ N for some submanifold N. We often consider the following family of functions on T*M: We use them to lift vector fields on M to vector fields on T*M. They are closely related to the functions defined above and still satisfy π_*(h⃗_u) = f_u.
There are essentially two possibilities for the parameter ν: it can be either 0 or, after an appropriate normalization of λ_t, −1. The extremals belonging to the first family are called abnormal, whereas the ones belonging to the second are called normal.
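For reference, the Hamiltonian of the Pontryagin Maximum Principle behind this dichotomy is commonly written as follows (a standard formula; the precise normalization in the elided display above may differ):

```latex
h_u^{\nu}(\lambda) = \langle \lambda, f_u(q) \rangle + \nu\,\varphi(u, q),
\qquad \nu \in \{0, -1\},
```

with ν = 0 giving the abnormal extremals, where the cost ϕ plays no role, and ν = −1 the normal ones.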

The Endpoint map and its differentiation
We now consider in detail the minimization problem in eq. (22) with fixed endpoints.
As in the previous section we denote by U_{q_0} ⊂ L^∞([0,1], U) the space of admissible controls at the point q_0 and define the following map: It takes the control u and gives the position at time t of the solution of eq. (21) starting from q_0. We call this map the endpoint map. It turns out that E_t is smooth; we now compute its differential and Hessian. The proofs of these facts can be found in the book [7] or in [1].
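The endpoint map is easy to realize numerically: fix a control u(·), integrate eq. (21), and read off the state at time t. The sketch below uses a control system of our own choosing (the unicycle on R^2 × S^1; it does not appear in the text) and a plain RK4 integrator.

```python
import numpy as np

# Illustrative control system (our choice): the unicycle,
#   qdot = f_u(q),  f_u(x, y, th) = (u1 cos th, u1 sin th, u2).
def f(q, u):
    x, y, th = q
    return np.array([u[0] * np.cos(th), u[0] * np.sin(th), u[1]])

def endpoint(u_func, q0, T=1.0, steps=1000):
    """Numerical endpoint map E_T: integrate qdot = f_{u(t)}(q) by RK4."""
    q, h = np.array(q0, float), T / steps
    for i in range(steps):
        t = i * h
        k1 = f(q, u_func(t))
        k2 = f(q + h / 2 * k1, u_func(t + h / 2))
        k3 = f(q + h / 2 * k2, u_func(t + h / 2))
        k4 = f(q + h * k3, u_func(t + h))
        q = q + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return q

# Constant control u = (1, 0): theta stays 0 and the trajectory is a straight
# line, so from q0 = (0, 0, 0) the endpoint is E_1(u) = (1, 0, 0).
q1 = endpoint(lambda t: np.array([1.0, 0.0]), [0.0, 0.0, 0.0])
print(np.round(q1, 6))   # [1. 0. 0.]
```

Differentiating such a map in the control u (e.g. by finite differences) gives a numerical proxy for the differential computed in Proposition 5 below.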
For a fixed control ũ consider the function h_ũ(λ) = h_{ũ(t)}(λ) and define the following non-autonomous flow, which plays the role of parallel transport in this context: It has the following properties: i) It extends to the cotangent bundle the flow which solves q̇ = f_{ũ(t)}(q) on the base. In particular, if λ_t is an extremal with initial condition λ_0, then π(Φ̃_t(λ_0)) = q_ũ(t), where q_ũ is the extremal trajectory. ii) Φ̃_t maps fibres to fibres, and its restriction to each fibre of T*M is an affine transformation.
We now suppose that λ(t) is an extremal and ũ a critical point of the functional J. We use the symplectomorphism Φ̃_t to pull back the whole curve λ(t) to the starting point λ_0. We can express all the first and second order information about the extremal using the following map and its derivatives: Notice that the first derivative vanishes whenever λ(t) is an extremal and ũ is the corresponding control.
Thus the first non-zero derivatives are those of order two. We define the following maps: We denote by Π = ker π_* the kernel of the differential of the natural projection π : T*M → M.

Proposition 5 (Differential of the endpoint map). Consider the endpoint map E_t : U_{q_0} → M. Fix a point ũ and consider the symplectomorphism Φ̃_t and the map Z_t defined above. The differential is the following map: In particular, if we identify T_{λ_0}(T*M) with R^{2m} and write Z accordingly, ũ is a regular point if and only if the map v ↦ ∫_0^t X_τ v_τ dτ is surjective; equivalently, if the following matrix is invertible: If d_ũ E_t is surjective, then (E_t)^{-1}(q_t) is smooth in a neighbourhood of ũ and its tangent space is given by:

When the differential of the endpoint map is surjective, a good geometric description of the situation is possible. The set of admissible controls becomes smooth (at least locally) and our minimization problem can be interpreted as a constrained optimization problem: we look for critical points of J on the submanifold {u ∈ U : E_t(u) = q_1}.

Definition 2. We say that a normal extremal λ(t) with associated control ũ(t) is strictly normal if the differential of the endpoint map at ũ is surjective.
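The surjectivity criterion via the Gramian-type matrix Γ = ∫_0^1 X_t X_t^* dt (which reappears in the uniqueness argument below) is easy to test numerically. A minimal sketch, with our own illustrative choice X_t = (1, t)^T, n = 2, k = 1:

```python
import numpy as np

# Gramian Gamma = int_0^1 X_t X_t^* dt for the illustrative choice X_t = (1, t)^T.
N = 2000
t = (np.arange(N) + 0.5) / N              # midpoint grid on [0, 1]
X = np.vstack([np.ones(N), t])            # 2 x N, columns X_{t_i}
Gamma = (X @ X.T) / N                     # Riemann sum of X_t X_t^* dt

# Exact value: [[1, 1/2], [1/2, 1/3]], determinant 1/12 > 0, so the map
# v -> int_0^1 X_tau v_tau dtau is surjective and u is a regular point.
print(np.round(Gamma, 3))
print(np.linalg.det(Gamma) > 0)           # True
```

Degeneracy of Γ would mean some covector annihilates X_t ν for all t and ν, i.e. the differential of the endpoint map misses a direction.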
It makes sense to go on and consider higher order optimality conditions. At critical points the Hessian of J (the second variation) is well defined, i.e. independent of the choice of coordinates. Using chronological calculus (see again [7] or [1]) it is possible to write the second variation of J on the kernel of the differential of the endpoint map.

Proposition 6 (Second variation). Suppose that (λ(t), ũ) is a strictly normal critical point of J with fixed initial and final point. For any u ∈ L^∞([0,1], R^k) such that ∫_0^1 X_t u_t dt = 0, the second variation of J has the following expression: The associated bilinear form is symmetric provided that u, v lie in a subspace that projects to a Lagrangian one via the map u ↦ Z(u). One often makes the assumption, customarily called the strong Legendre condition, that the matrix H_t is strictly negative definite with uniformly bounded inverse. This guarantees that the term: Vice versa, any triple ((Σ, σ), Π, Z) as above determines a pair (K, V). We can define the skew-symmetric part A of K as: A determines the whole operator K, and its domain is recovered as V = Z^{-1}(Π).
Proof. The proof is essentially a reformulation of Theorem 2. Given the operator, we construct the symplectic space (Σ, σ) by taking as vector space the image of the skew-symmetric part, Im(A), and as symplectic form ⟨A·, ·⟩.
The transversality condition corresponds to the fact that the differential of the endpoint map is surjective.
The only thing left to show is the uniqueness of the triple. Without loss of generality we can assume that the symplectic space (Σ, σ) = (R^{2n}, σ) is the standard one and that the Lagrangian subspace Π is the vertical subspace. In these coordinates, define the following map: To determine uniqueness we have to study an affine equation; it is thus sufficient to study the kernel of F. Suppose for simplicity that X_t and Y_t are continuous in t. We have to solve the equation: It follows that F(Y_t) = 0 if and only if the subspace V_{[0,1]} is isotropic. Since we are in finite dimension, we can consider a finite number of instants t_i, to which we can restrict to generate the whole V_{[0,1]}. Call I the set of these instants. Without loss of generality we can assume that {∑_{i∈I} X_{t_i} ν_i : ν_i ∈ R^k} = R^n. This is so because the image of Z is transversal to Π and thus Γ = ∫_0^1 X_t X_t^* dt is non-degenerate. In fact, if the subspace {∑_{i=1}^l X_{t_i} ν_i | ν_i ∈ R^k, l ∈ N} were a proper subspace of R^n, there would be a vector µ such that ⟨µ, X_t ν⟩ = 0 for all t ∈ [0,1] and all ν ∈ R^k, hence an element of the kernel of Γ — a contradiction. Now we evaluate the equation F(Y_t) = 0 ⟺ Y_t^* X_τ = X_t^* Y_τ at the instants t = t_i that guarantee controllability. One can read off the following identities: Y_t^* v_j = X_t^* c_j, where the v_j's form a basis of R^n and the c_j are free parameters. Taking transposes we get that Y_t = G X_t.
It is straightforward to check that, if Y_t = G X_t, then G must be symmetric; in fact: Z_t^* J Z_τ = Y_t^* X_τ − X_t^* Y_τ = X_t^*(G^* − G) X_τ = 0 ⟺ G = G^*. So uniqueness is proved when X_t and Y_t are continuous.
The case in which X_t and Y_t are just L^2 (matrix-valued) functions can be dealt with similarly: one has just to replace evaluations with integrals of the form ∫_{t−ε}^{t+ε} Z_τ ν dτ and ∫_{t−ε}^{t+ε} X_τ ν dτ and interpret every equality as holding almost everywhere. The only thing left to show is how to construct a control system with a given (K, V) as second variation. By the equivalence stated above it is enough to show that we can realize any given map Z : L^2([0,1], R^k) → Σ with a proper control system. We can assume without loss of generality that (Σ, σ) is just R^{2m} with the standard symplectic form and that Π is the vertical subspace. With these choices the map Z is given by: The operator K is then given by K(v)(t) = ∫_0^t Z_t^* J Z_τ v_τ dτ and V = {v : ∫_0^1 X_t v_t dt = 0}. Consider the following linear quadratic system on R^m: where B_t and Ω_t are matrices of size m × k; the Hamiltonian of PMP reads: Take as extremal control u_t ≡ 0; it is easy to check that the reparametrization flow Φ̃_t defined in eq. (23) is just the identity, and the matrix Z_t for this problem is the following: So it is enough to take Ω_t = Y_t and B_t = X_t.
We can also reformulate the second part of Theorem 2, relating the capacity of K to the eigenvalues of A. We make the following assumptions: 1. the map t → Z_t is piecewise analytic in t; 2. the maximum condition in the statement of PMP defines a C^2 function Ĥ_t(λ) = max_{u∈R^k} h_u^t(λ) in a neighbourhood of the strictly normal regular extremal we are considering.
Under the above assumptions, the following proposition clarifies the link between the matrices Z_t and H_t and the function Ĥ_t. A proof can be found either in [7, Proposition 21.3] or in [1].

Proposition 7. Suppose that (λ(t), ũ) is an extremal and that the function Ĥ_t is C^2. Using the flow defined in eq. (23), define H_t(λ) = (Ĥ_t − h_{ũ(t)}) ∘ Φ̃_t(λ). It holds that: Define R_t = max_{v∈R^k, ||v||=1} ||Z_t v|| and let {±iζ_j(t)}_{j=1}^l be the eigenvalues of i Z_t^* J Z_t as defined in Section 3. We have the following proposition.
Leaving optimality conditions aside, Theorem 1 gives the asymptotic distribution of the eigenvalues of the second variation for totally singular extremals (see Definition 3). As mentioned in the previous section, we can produce a second variation also in the non strictly normal case, which is at least formally very similar to the normal one. However, a common occurrence is that the matrix H_t degenerates completely and is constantly equal to the zero matrix. This is the case for affine control systems and abnormal extremals in sub-Riemannian geometry, i.e. systems of the form: In this case the Legendre condition H_t ≤ 0 (see the previous section) does not give much information, and one looks for higher order optimality conditions. This is usually done exactly as in Lemma 1: the first optimality conditions one finds are the Goh condition and the generalized Legendre condition, which prevent the second variation from being strongly indefinite.
In the notation of Lemma 1, the Goh condition is written as Q_1 ≡ 0, i.e. Z_t^* J Z_t ≡ 0. It can be reformulated in geometric terms as follows: if λ_t is the extremal, then λ_t([∂_u f_u(q(t)) v_1, ∂_u f_u(q(t)) v_2]) = 0 for all v_1, v_2 ∈ R^k. From Theorem 1 it is clear that if Q_1 ≢ 0, the second variation has infinite negative index and the eigenvalues distribute evenly between the negative and positive parts of the spectrum. Then one asks that the second term Q_2 is non-positive definite (recall the different sign convention in Proposition 6); otherwise the negative part of the spectrum of −Q_2 becomes infinite. In our notation this condition reads: Again, it can be translated into a differential condition along the extremal; however, this time it will in general involve more than just commutators if the system is not control-affine.
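For control-affine systems the Goh condition is a statement about Lie brackets of the controlled fields. The sketch below uses the Heisenberg fields on R^3 — a standard example of our own choosing, not taken from the text — and computes [f_1, f_2] numerically via [f, g] = Dg·f − Df·g. Since an abnormal λ_t annihilates f_1 and f_2, and the Goh condition further forces λ_t([f_1, f_2]) = 0, the computed bracket ∂_z shows that only the zero covector satisfies all three constraints at a point.

```python
import numpy as np

# Heisenberg vector fields on R^3 (standard illustrative example):
#   f1 = d/dx - (y/2) d/dz,   f2 = d/dy + (x/2) d/dz.
def f1(q):
    x, y, z = q
    return np.array([1.0, 0.0, -y / 2])

def f2(q):
    x, y, z = q
    return np.array([0.0, 1.0, x / 2])

def jacobian(f, q, h=1e-6):
    """Numerical Jacobian Df(q) by central differences."""
    n = len(q)
    Jf = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        Jf[:, j] = (f(q + e) - f(q - e)) / (2 * h)
    return Jf

def bracket(f, g, q):
    """Lie bracket [f, g](q) = Dg(q) f(q) - Df(q) g(q)."""
    return jacobian(g, q) @ f(q) - jacobian(f, q) @ g(q)

q0 = np.array([0.3, -0.7, 1.1])
b = bracket(f1, f2, q0)
print(np.round(b, 6))   # (0, 0, 1): the field d/dz
```

Because the fields are affine in the coordinates, the central differences are exact up to rounding, and the bracket is the constant field ∂_z at every point.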
If Q_2 ≡ 0, one can take more derivatives and find new conditions. In particular, using the notation of Lemma 1, one must always ask that the first non-zero term in the expansion is of even order and that the matrix of its coefficients is non-positive, in order to have finite negative index.