1 Introduction

The main focus of this paper is the study of a particular class of compact operators K on the Hilbert space \(L^2([0,1],\mathbb {R}^k)\) with the standard Hilbert structure. They are characterized by the following properties:

  • there exists a subspace of \(L^2([0,1],\mathbb {R}^k)\) of finite codimension, which we call \(\mathcal {V}\), on which K becomes a self-adjoint operator, i.e.:

    $$ \langle u, K v \rangle = \langle K u, v\rangle \quad \forall u, v \in \mathcal{V}, $$
    (1)
  • K is a Hilbert-Schmidt operator with an integral kernel of a particular form, namely:

    $$ K(v)(t) = {\int}_0^tV(t,\tau)v(\tau) d\tau, \quad v \in L^2([0,1],\mathbb{R}^k), $$
    (2)

    where \(V(t,\tau)\) is a matrix whose entries are \(L^2\) functions. We call the class of operators satisfying this last condition Volterra-type operators (see the discretization sketch below).
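
As a concrete illustration (ours, not taken from the paper), the following minimal sketch discretizes a scalar (k = 1) Volterra-type operator on a uniform grid; the kernel V and the grid size are arbitrary choices.

```python
import numpy as np

def volterra_matrix(V, N):
    """N x N matrix approximating K(v)(t) = int_0^t V(t,s) v(s) ds for k = 1."""
    t = (np.arange(N) + 0.5) / N        # midpoint grid on [0, 1]
    h = 1.0 / N                         # quadrature weight
    M = np.zeros((N, N))
    for i in range(N):
        M[i, :i] = V(t[i], t[:i]) * h   # integrate only over s < t
    return M

K = volterra_matrix(lambda t, s: np.cos(t - s), 400)
# Eigenvalues of the symmetrized operator accumulate at zero, as the general
# theory of compact operators recalled in Section 2 predicts.
lam = np.sort(np.linalg.eigvalsh(0.5 * (K + K.T)))
print(lam[:3], lam[-3:])
```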

The main results of this paper are a fairly general study of the asymptotic distribution of the eigenvalues of K when restricted to any subspace \(\mathcal {V}\) which satisfies (1) (Theorem 1) and a characterization result for operators satisfying the two properties stated above (Theorem 2).

The first result is proved in Section 3. We first restrict ourselves to operators \(\tilde {K}\) of the form:

$$ \tilde K(v)(t) = -{\int}_0^t\sigma (Z_{\tau} v_{\tau},Z_t \cdot) d\tau. $$
(3)

Here, \(Z_t\) is a 2n × k matrix depending analytically on t and σ is the standard symplectic form on \(\mathbb {R}^{2n}\) (see Remark 1). A similar asymptotic formula was proved in [3, Theorem 1]: it was shown that, if \(\{\lambda _n(\tilde {K})\}_{n \in \mathbb {Z}}\) denotes the decreasing (resp. increasing) arrangement of the positive (resp. negative) eigenvalues of \(\tilde K\), we have either:

$$ \lambda_n(\tilde{K}) = \frac{\xi}{\pi n} + O(n^{-5/3}) \quad \text{ or } \quad \lambda_n(\tilde K) = O(n^{-2}), $$
(4)

for \(n \in \mathbb {Z}\) sufficiently large and for some ξ ≥ 0; the first alternative occurs when ξ > 0 and the second when ξ = 0. The number ξ is called the capacity and depends only on the matrix \(Z_t\) in the definition of \(\tilde {K}\).

If ξ = 0, we go further with the expansion in (4). We single out the term giving the principal contribution to the asymptotics by representing the quadratic form associated with \(\tilde K\) as:

$$ Q(v) = \langle v,\tilde K v\rangle = -{\int}_0^1 {\int}_0^t \sigma(Z_{\tau} v_{\tau} ,Z_t v_t ) d \tau dt = {\sum}_{i=1}^{2k} Q_i(v)+R_k(v). $$

The result mentioned above corresponds to the case \(Q_1 \ne 0\); in Theorem 1 we give the asymptotics for the general case.

From the point of view of geometric control theory, Theorem 1 can be seen as an asymptotic analysis of the spectrum of the second variation for particular classes of singular extremals and a quantitative version of some necessary optimality conditions.

Precise definitions will be given in Section 5. Standard references on the second variation are [7, Chapter 20] and [1]. For now, it is enough to know that the second variation Q of an optimal control problem on a manifold M is a linear operator on \(L^2([0,1],\mathbb {R}^k)\) of the following form:

$$ \langle Q v, u \rangle = -{\int}_0^1 \langle H_t v_t,u_t \rangle dt -{\int}_0^1 {\int}_0^t \sigma(Z_{\tau} v_{\tau} ,Z_tu_t ) d \tau dt, $$
(5)

where \(H_t\) is a symmetric k × k matrix, σ is the standard symplectic form on \(T_{\eta}(T^{*}M)\) and \(Z_t: \mathbb {R}^k \to T_{\eta }(T^{*}M)\) is a linear map with values in the tangent space at a fixed point \(\eta \in T^{*}M\).

For totally singular extremals, the matrix \(H_t\) appearing in (5) is identically zero and the second variation reduces to an operator of the same form as in (3).

In Section 4, we prove Theorem 2. We first show that any K satisfying (1) and (2) is completely determined by its (finite rank) skew-symmetric part \(\mathcal {A}\) and can always be represented as in (3). Then we relate the capacity of K to the spectrum of \(\mathcal {A}\).

In Section 5, we recall some basic notions from control theory, reformulate Theorem 2 in a more control-theoretic fashion, and use it to characterize the operators coming from the second variation of an optimal control problem. Moreover, we give a geometric interpretation of the capacity ξ appearing in (4) in terms of the Hessian of the maximized Hamiltonian coming from the Pontryagin Maximum Principle.

2 Overview of the Main Results

We begin this section by recalling some general facts about the spectrum of compact operators; then we fix some notation and give a precise statement of the main results. Given a compact self-adjoint operator K on a Hilbert space \({\mathscr{H}}\), we can define a quadratic form by setting Q(v) = 〈v,K(v)〉. The eigenvalues of Q are by definition those of K and we denote by Σ+(Q) and Σ−(Q) the positive and negative parts of the spectrum of Q.

By the standard spectral theory of compact operators (see [12]), the non-zero eigenvalues of K either form a finite set or accumulate at zero, and each has finite multiplicity. Consider the positive part Σ+(Q) of the spectrum of Q and let λ ∈ Σ+(Q). Denote by \(m_{\lambda}\) the multiplicity of the eigenvalue λ. We can introduce a monotone non-increasing sequence \(\{\lambda _{n}\}_{n \in \mathbb {N}}\) indexing the eigenvalues of K, requiring that the cardinality of the set \(\{n : \lambda_n = \lambda\}\) equals \(m_{\lambda}\) for every λ ∈ Σ+(Q).

This will be called the monotone arrangement of Σ+(Q). We can perform the same construction indexing by −n, \(n \in \mathbb {N}\), the negative part of the spectrum Σ−(Q). This time we require that the sequence \(\{\lambda _{-n}\}_{n \in \mathbb {N}}\) is non-decreasing. Provided that Σ+(Q) and Σ−(Q) are both infinite, we obtain a sequence \(\{\lambda _{n}\}_{n \in \mathbb {Z}}\).
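
In code, the construction of the monotone arrangement reads as follows (a small illustration of ours, not part of the paper):

```python
def monotone_arrangement(eigs):
    """Split nonzero eigenvalues (listed with multiplicity) into the two
    monotone sequences {lambda_n}_{n >= 1} and {lambda_{-n}}_{n >= 1}."""
    pos = sorted((x for x in eigs if x > 0), reverse=True)  # non-increasing
    neg = sorted(x for x in eigs if x < 0)                  # non-decreasing
    return pos, neg

pos, neg = monotone_arrangement([3.0, 0.5, 0.5, -1.0, -2.0])
print(pos)  # [3.0, 0.5, 0.5] -- an eigenvalue of multiplicity 2 appears twice
print(neg)  # [-2.0, -1.0]
```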

Definition 1

Let Q be a quadratic form on a Hilbert space \({\mathscr{H}}\) and let \(j \in \mathbb {N}\):

  • if j is odd, Q has j-capacity ξ > 0 with remainder of order ν > 0 if Σ+(Q) and Σ−(Q) are both infinite and:

    $$ \lambda_{n} = \frac{\xi}{(\pi n)^{j}} + O(n^{-\nu-j}) \quad \text{ as }\quad n \to \pm \infty, $$
  • if j is even, Q has j-capacity (ξ+,ξ−) with remainder of order ν > 0 if both Σ+(Q) and Σ−(Q) are infinite and:

    $$ \begin{array}{l} \lambda_{n} = \frac{\xi_{+}}{(\pi n)^{j}} + O(n^{-\nu-j}) \quad \text{ as }\quad n \to + \infty,\\ \lambda_{n} = \frac{\xi_{-}}{(\pi n)^{j}} + O(n^{-\nu-j}) \quad \text{ as }\quad n \to - \infty, \end{array} $$

    where ξ± ≥ 0, or if at least one of Σ+(Q) and Σ−(Q) is infinite and the corresponding monotone arrangement satisfies the relevant asymptotic relation;

  • if the spectrum is finite, or \(\lambda_n = O(n^{-\nu})\) as \(n \to \pm \infty \) for any ν > 0, we say that Q has \(\infty\)-capacity.

The behaviour of the sequence \(\{\lambda _{n}\}_{n \in \mathbb {Z}}\) is closely related to the following counting functions:

$$ C^{+}_{j}(n) =\# \left\{l \in \mathbb{N} : 0< \frac{1}{\sqrt[j]{\lambda_{l}}}<n\right\} \quad C^{-}_{j}(n) =\# \left\{l \in \mathbb{N} : -n< \frac{-1}{\sqrt[j]{\vert\lambda_{-l}\vert}}<0\right\}. $$

The requirement of Definition 1 for the j-capacity can be translated into the following asymptotics for the functions \(C^{\pm }_{j}(n)\):

$$ C^{\pm}_{j}(n) = \frac{\sqrt[j]{\xi_{\pm}}}{\pi}\, n + O(n^{1-\nu}) \quad \text{ as } \quad n \to +\infty. $$
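
The translation can be tested on a model sequence; in the sketch below (ours; the values of ξ and j are arbitrary) the counting function grows with the predicted slope \(\sqrt[j]{\xi}/\pi\):

```python
import math

def count_plus(eigs, j, n):
    """C_j^+(n) = #{ l : 0 < lambda_l**(-1/j) < n } over positive eigenvalues."""
    return sum(1 for lam in eigs if lam > 0 and lam ** (-1.0 / j) < n)

xi, j = 2.0, 3
eigs = [xi / (math.pi * l) ** j for l in range(1, 10_000)]
print(count_plus(eigs, j, 100))         # ~ 40
print(xi ** (1.0 / j) / math.pi * 100)  # predicted value ~ 40.1
```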

We illustrate here some of the properties of the j-capacity. The proofs are given in Section 3, Proposition 3. Without loss of generality we state the properties for the positive part of the spectrum; analogous results hold for the negative one.

  • (Homogeneity) if \(Q_1\) and \(Q_2\) are quadratic forms on two Hilbert spaces \({\mathscr{H}}_{1}\) and \({\mathscr{H}}_{2}\) of j-capacity \(\xi_1\) and \(\xi_2\) respectively, with the same remainder ν, then \(aQ_1\) (for a > 0) has j-capacity \(a\xi_1\) and the direct sum \(Q_1 \oplus Q_2\) on \({\mathscr{H}}_{1}\oplus {\mathscr{H}}_{2}\) has j-capacity \(\left (\sqrt [j]{\xi _{1}}+\sqrt [j]{\xi _{2}}\right )^{j}\), both with remainder ν.

  • (Independence of restriction) If \(\mathcal {V} \subseteq {\mathscr{H}}\) is a subspace of finite codimension, then Q has j-capacity ξ with remainder ν if and only if its restriction to \(\mathcal {V}\) has j-capacity ξ with remainder ν.

  • (Additivity) if \(Q_1\) has j-capacity ξ with remainder ν and \(Q_2\) has j-capacity 0 with remainder of the same order ν, then their sum \(Q_1 + Q_2\) has the same capacity with remainder \(\nu ^{\prime } = \frac {(j+\nu )(j+1)}{j+\nu +1}\).

The remaining part of this section deals with quadratic forms Q coming from operators of the form given in (3). Suppose that \(Z_t\) is a 2n × k matrix which depends piecewise analytically on the parameter t ∈ [0,1] and define the following 2n × 2n skew-symmetric matrix:

$$ J = \left( \begin{array}{cc} 0 &-Id_{n} \\ Id_{n} &0 \end{array}\right). $$
(6)

Consider as Q the following quadratic form on \(L^{2}([0,1],\mathbb {R}^{k})\):

$$ Q(v) = \langle v, K(v) \rangle = {{\int}_{0}^{1}} {{\int}_{0}^{t}} \langle Z_{t} v(t), J Z_{\tau} v(\tau)\rangle d\tau dt. $$
(7)

Remark 1

The operator K and the bilinear form Q(u,v) = 〈u,K(v)〉 are not symmetric. However, the operator

$$K(v) ={{\int}_{0}^{t}} Z_{t}^{*}JZ_{\tau} v(\tau)d\tau, $$

satisfies (1) and becomes symmetric on a finite codimension subspace \(\mathcal {V}\). It is enough to require that the integral \({{\int \limits }_{0}^{1}} Z_{t} v(t) dt\) lies in a fixed Lagrangian subspace of \((\mathbb {R}^{2n},\sigma )\) for any \(v \in \mathcal {V}\); for instance, one can take the fibre (or vertical subspace), i.e.:

$$ {\Pi} = \{(p,0) : p \in \mathbb{R}^{n}\} \subset \mathbb{R}^{2n}. $$
(8)

Here, σ denotes the standard symplectic form on \(\mathbb R^{2n}\) defined as \(\sigma (x,x^{\prime }) = \langle J x,x^{\prime }\rangle .\)
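
A quick finite-dimensional check of ours of this symmetry claim, under the simplifying assumptions \(Z_t \equiv Id_2\) (so k = 2n = 2) and a uniform grid; Π is the vertical subspace (8):

```python
import numpy as np

N = 500
h = 1.0 / N
J = np.array([[0.0, -1.0], [1.0, 0.0]])
K = np.kron(np.tril(np.ones((N, N)), -1) * h, J)  # K(v)(t) = int_0^t J v ds

rng = np.random.default_rng(1)
u, v = rng.standard_normal(2 * N), rng.standard_normal(2 * N)
print(u @ K @ v - v @ K @ u)   # O(1): K is not symmetric on all of L^2

for w in (u, v):
    w[1::2] -= w[1::2].mean()  # force int_0^1 w dt into Pi = {(p, 0)}
print(u @ K @ v - v @ K @ u)   # ~ 0, up to O(N^{-1/2}) quadrature error
```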

Let f be a smooth function on [0,1] and let \(k \in \mathbb {N}\); denote by \(f^{(k)} = \frac {d^{k} f}{d t^{k}}\) the k-th derivative with respect to t. For j ≥ 1 define the following matrix valued functions:

$$ A_{j}(t)= \left\{\begin{array}{ll} \left( Z_{t}^{(k-1)}\right)^{*}J Z^{(k-1)}_{t} \quad &\text{if }j = 2k-1\\ \left( Z_{t}^{(k-1)}\right)^{*}J Z^{(k)}_{t} \quad &\text{if }j = 2k \end{array}\right. $$
(9)

We use ρt to denote any eigenvalue of the matrix Aj(t). If j = 2k, define:

$$ \mu_{t,2k}^{+} :={\sum}_{\rho_{t} : \rho_{t}>0} \sqrt[2k]{ \rho_{t} } \qquad \mu_{t,2k}^{-} :={\sum}_{\rho_{t}:\rho_{t}<0} \sqrt[2k]{\vert \rho_{t}\vert }. $$

For odd indices, A2k− 1 is skew-symmetric and thus the spectrum is purely imaginary. So we define the function:

$$ \mu_{t,2k-1} = {\sum}_{\rho_{t} : -i\rho_{t}>0} \sqrt[2k-1]{-i\rho_{t}}. $$
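
Concretely, these densities can be computed from the spectrum of \(A_j(t)\) as in the following sketch (ours):

```python
import numpy as np

def mu_even(A, k):
    """(mu^+_{t,2k}, mu^-_{t,2k}) for a symmetric matrix A = A_{2k}(t)."""
    rho = np.linalg.eigvalsh(A)
    plus = sum(r ** (1.0 / (2 * k)) for r in rho if r > 0)
    minus = sum(abs(r) ** (1.0 / (2 * k)) for r in rho if r < 0)
    return plus, minus

def mu_odd(A, k):
    """mu_{t,2k-1} for a skew-symmetric A = A_{2k-1}(t) (spectrum in i*R)."""
    rho = np.linalg.eigvals(A)
    # -i*rho > 0 selects the eigenvalues i*b with b > 0, and then -i*rho = b.
    return sum(r.imag ** (1.0 / (2 * k - 1)) for r in rho if r.imag > 1e-12)

print(mu_odd(np.array([[0.0, -2.0], [2.0, 0.0]]), 1))  # 2.0
```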

We are now ready to state the first main result of the section.

Theorem 1

Let Q be the quadratic form in (7). Then Q has either \(\infty\)-capacity or j-capacity with remainder of order ν = 1/2. More precisely, let j ≥ 1 be the smallest integer such that \(A_j(t)\) is not identically zero; then

  • if j = 2k − 1, the (2k−1)-capacity ξ is given by:

    $$ \xi = \left( {{\int}_{0}^{1}}\mu_{t,2k-1} dt \right)^{2k-1}, $$

    and thus for \(n\in \mathbb {Z}\) sufficiently large:

    $$ \lambda_{n} = \frac{\left( {{\int}_{0}^{1}}\mu_{t,2k-1} dt \right)^{2k-1}}{(\pi n)^{2k-1}} + O(n^{-2k+1/2}). $$
  • if j = 2k, the 2k-capacity (ξ+,ξ−) is given by:

    $$ \xi_{\pm} = \left( {{\int}_{0}^{1}}\mu^{\pm}_{t,2k} dt \right)^{2k}, $$

    and thus for \(n\in \mathbb {Z}\) sufficiently large:

    $$ \lambda_{n} = \frac{\left( {{\int}_{0}^{1}}\mu^{\pm}_{t,2k} dt \right)^{2k}}{(\pi n)^{2k}} + O(n^{-2k-1/2}). $$
  • if \(A_j(t) \equiv 0\) for all j, then Q has \(\infty\)-capacity.
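
The simplest instance of the theorem can be tested numerically: for \(Z_t \equiv Id_2\) (so k = 2n = 2) one has \(A_1 = J\), \(\mu_{t,1} = 1\) and the predicted 1-capacity is ξ = 1, i.e. \(\lambda_n \approx 1/(\pi n)\). The sketch below (ours; the grid size is an arbitrary choice) discretizes the operator of Remark 1 and restricts it to the subspace \(\{\int_0^1 v\,dt = 0\}\), on which it is symmetric:

```python
import numpy as np

N = 800
h = 1.0 / N
J = np.array([[0.0, -1.0], [1.0, 0.0]])

# K(v)(t) = int_0^t J v(s) ds, i.e. the operator of Remark 1 with Z_t = Id_2.
K = np.kron(np.tril(np.ones((N, N)), -1) * h, J)

# Orthogonal projection onto { v : int_0^1 v dt = 0 }.
C = np.kron(np.ones((N, 1)), np.eye(2)) * np.sqrt(h)  # orthonormal columns
P = np.eye(2 * N) - C @ C.T

lam = np.sort(np.linalg.eigvalsh(P @ (0.5 * (K + K.T)) @ P))[::-1]
for n in (2, 10, 50):
    print(n, np.pi * n * lam[n - 1])  # -> xi = 1 (exact for even n)
```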

Remark 2

It is worth remarking that in Theorem 1 of [3] the order of the remainder for the 1-capacity was slightly better: 2/3 instead of 1/2.

The proof of this result is given in Section 3. The next theorem gives a characterization of the operators satisfying (1) and (2) and a geometric interpretation of the 1-capacity. Before going to the statement, let us introduce the following notation. Let \(\mathcal {A}\) denote the skew-symmetric part of K:

$$ \mathcal{A} = \frac{1}{2}\left( K-K^{*}\right). $$

Let Σ be the spectrum of \(\mathcal {A}\) and \(\text {Im}(\mathcal {A})\) its image.

Theorem 2

Let K be an operator satisfying (1) and (2). Then \(\mathcal {A}\) has finite rank and completely determines K. More precisely, if \(\mathcal {A}\) has rank 2m and is represented as:

$$ \mathcal{A}(v)(t):= \frac{1}{2} Z_{t}^{*}\mathcal{A}_{0}{{\int}_{0}^{1}} Z_{\tau} v(\tau) d\tau, $$

for a skew-symmetric 2m × 2m matrix \(\mathcal {A}_{0}\) and a 2m × k matrix \(Z_t\), then:

$$ K(v)(t) = {{\int}_{0}^{t}}Z_{t}^{*}\mathcal{A}_{0}Z_{\tau} v(\tau) d\tau . $$
(10)

Let Σ be the spectrum of \(\mathcal {A}\); if the matrix \(Z_t\) can be chosen to be piecewise analytic, the 1-capacity of K can be bounded by

$$ \xi \le 2\sqrt{m}\sqrt{{\sum}_{\rho \in {\Sigma} : -i \rho >0 }-\rho^{2}} \le2 \sqrt{m}{\sum}_{\rho \in {\Sigma} : -i \rho >0 } \vert\rho \vert. $$

3 Proof of Theorem 1

Before going to the proof of Theorem 1 we still need some auxiliary results. We start with Lemma 1, which singles out the main contributions to the asymptotics of the eigenvalues of Q (the quadratic form defined in (7)). The first non-zero term of the decomposition we give will determine the rate of decay of the eigenvalues (see Proposition 4).

Before showing this and proving the precise estimates, we need to carry out the explicit computation of the asymptotics in some model cases, namely when the matrices \(A_j\) are constant. Then, we have to show how the j-capacity behaves with respect to natural operations such as direct sums of quadratic forms or restrictions to finite codimension subspaces (Proposition 3).

Let us start with some notation:

$$ v_{k}(t) = {{\int}_{0}^{t}} v_{k-1}(\tau) d\tau, \quad v_{0}(t) = v(t) \in L^{2}([0,1],\mathbb{R}^{m}) $$
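
In discrete form the iterated primitives are just repeated cumulative sums; a small helper of ours:

```python
import numpy as np

def primitives(v, k, h):
    """Samples of v_0, v_1, ..., v_k, where v_j(t) = int_0^t v_{j-1}(s) ds,
    computed by left Riemann sums on a uniform grid of step h."""
    out = [np.asarray(v, dtype=float)]
    for _ in range(k):
        out.append(np.concatenate(([0.0], np.cumsum(out[-1])[:-1])) * h)
    return out
```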

Suppose that the map \(t \mapsto Z_t\) is real analytic (or at least regular enough to perform the necessary derivatives) and integrate by parts twice:

$$ \begin{array}{@{}rcl@{}} Q (v)&=& {{\int}_{0}^{1}} \langle Z_{t} v(t),{{\int}_{0}^{t}} JZ_{\tau} v(\tau) d \tau \rangle dt \\ &=& {{\int}_{0}^{1}} \langle Z_{t} v(t),JZ_{t} v_{1}(t)\rangle - \langle Z_{t} v(t),{{\int}_{0}^{t}}J\dot{Z}_{\tau} v_{1}(\tau) d \tau \rangle dt \\ &=& {{\int}_{0}^{1}} \langle Z_{t} v(t),JZ_{t} v_{1}(t)\rangle +\langle Z_{t} v_{1}(t),J\dot{Z}_{t} v_{1}(t)\rangle dt +\\ &&\ +{{\int}_{0}^{1}} \langle \dot{Z}_{t} v_{1}(t),J{{\int}_{0}^{t}}\dot{Z}_{\tau} v_{1}(\tau) d\tau\rangle dt - \left[ \langle {{\int}_{0}^{1}} Z_{t} v(t)dt,J{{\int}_{0}^{1}} \dot{Z}_{t}v_{1}(t)dt \rangle\right] \end{array} $$

If we impose the condition \({{\int \limits }_{0}^{1}} v_{t} dt =0 (\iff v_{1}(1)=0)\), the term in brackets vanishes:

$$ \langle {{\int}_{0}^{1}} Z_{t} v(t)dt,J{{\int}_{0}^{1}} \dot{Z}_{t}v_{1}(t)dt \rangle = \langle {{\int}_{0}^{1}} Z_{t} v(t)dt,J Z_{1}v_{1}(1) \rangle - \langle {{\int}_{0}^{1}} Z_{t} v(t)dt,J{{\int}_{0}^{1}} Z_{t}v(t)dt \rangle $$

and we can write Q as a sum of three terms

$$ Q(v) = Q_{1}(v) + Q_{2}(v)+ R_{1}(v) $$

In analogy, we can make the following definitions:

$$ \begin{array}{@{}rcl@{}} Q_{2k-1}(v) &=& {{\int}_{0}^{1}} \langle Z^{(k-1)}_{t} v_{k-1}(t),JZ^{(k-1)}_{t} v_{k}(t)\rangle dt = {{\int}_{0}^{1}} \langle v_{k-1}(t),A_{2k-1}(t) v_{k}(t)\rangle dt\\ Q_{2k}(v) &=& {{\int}_{0}^{1}}\langle Z^{(k-1)}_{t} v_{k}(t),JZ^{(k)}_{t} v_{k}(t)\rangle dt = {{\int}_{0}^{1}}\langle v_{k}(t),A_{2k}(t) v_{k}(t)\rangle dt \\ R_{k} &=& {{\int}_{0}^{1}} \langle Z^{(k)}_{t} v_{k}(t),J{{\int}_{0}^{t}}{Z}^{(k)}_{\tau} v_{k}(\tau) d\tau\rangle dt \\ V_{k} &= &\{v \in L^{2}([0,1],\mathbb{R}^{m}) : v_{l}(1)=0, \forall 0<l \le k\} \end{array} $$

Here, the matrices Aj(t) are exactly those defined in (9).

Lemma 1

For every \(j \in \mathbb {N}\), on the subspace Vj, the form Q can be represented as

$$ Q(v) = {\sum}_{k=1}^{2j} Q_{k}(v) + R_{j}(v) $$
(11)

The matrices \(A_{2k}(t)\) are symmetric provided that \(\frac {d}{dt} A_{2k-1}(t)\equiv 0\). On the other hand, \(A_{2k-1}\) is always skew-symmetric.

Proof

It is sufficient to notice that R1(v) has the same form as Q(v) but with v1 instead of v and \(\dot {Z}_{t}\) instead of Zt. Thus, the same scheme of integration by parts gives the decomposition.

Notice that \(A_{2k}(t) = A^{*}_{2k}(t)+\frac {d}{dt} A_{2k-1}(t)\); thus, the skew-symmetric part of A2k(t) is zero if A2k− 1 is zero or constant. A2k− 1(t) is always skew-symmetric by definition. □

Now, we would like to compute explicitly the spectrum of the Qj when the matrices Aj are constant. Unfortunately, describing the spectrum with boundary conditions given by the Vj is quite hard. Already for Q4 the equation determining it cannot be solved explicitly.

We will derive the Euler-Lagrange equations for the \(Q_j\), turn instead to periodic boundary conditions, for which everything becomes very explicit, and show how to relate the solutions of the two boundary value problems we are considering. Let us write down the Euler-Lagrange equations for the forms \(Q_j\). If j = 2k, integration by parts yields:

$$ \begin{array}{@{}rcl@{}} Q_{2k}(v)- \lambda \vert\vert v\vert\vert^{2}&=& {{\int}_{0}^{1}} \langle v_{k}(t), A_{2k} v_{k}(t)\rangle- \lambda \langle v_{0}(t),v_{0}(t) \rangle dt \\ &=& {{\int}_{0}^{1}} \langle v_{0}(t),(-1)^{k}A_{2k}v_{2k}(t)-\lambda v_{0}(t)\rangle dt+ \\&& \quad +{\sum}_{r=0}^{k-1}(-1)^{r} \left[\langle v_{k-r}(t),A_{2k}v_{k+r+1}(t)\rangle\right]_{0}^{1} \end{array} $$

Notice that the boundary terms vanish identically if we impose the vanishing of \(v_j\) for 1 ≤ j ≤ k at the boundary points.

We change notation and define w(t) = v2k(t) and \(w^{(j)} (t) = \frac {d^{j}}{dt^{j}}(w(t))\). The new equations are:

$$ w^{(2k)}(t) = \frac{(-1)^{k}}{\lambda} A_{2k} w(t) $$

We can perform a linear change of coordinates that diagonalizes \(A_{2k}\) and reduces the problem to m scalar (1-dimensional) systems. Imposing periodic boundary conditions, we are thus left with the following boundary value problem:

$$ w^{(2k)}(t) = \frac{(-1)^{k}\mu}{\lambda} w(t) \quad w^{(j)}(0)=w^{(j)}(1) \text{ for } 0\le j\le 2k-1 $$
(12)

The case of odd j is very similar; in fact, \(Q_{2k-1}(v)\) can be rewritten as:

$$ \begin{array}{@{}rcl@{}} Q_{2k-1}(v)-\lambda \vert\vert v\vert\vert^{2} &= & {{\int}_{0}^{1}} \langle v_{k-1}(t), A_{2k-1} v_{k}(t)\rangle- \lambda \langle v_{0}(t),v_{0}(t) \rangle dt \\ &=& {{\int}_{0}^{1}} \langle v_{0}(t), (-1)^{k-1}A_{2k-1}v_{2k-1}(t)-\lambda v_{0}\rangle dt + \textit{b.t.} \end{array} $$

Here, by b.t. we mean boundary terms like those appearing in the previous equation. They again disappear if we assume that \(v \in V_j\). Thus, we end up with a boundary value problem similar to the one we had before, with the difference that now the matrix \(A_{2k-1}\) is skew-symmetric.

$$ w^{(2k-1)}(t) = \frac{(-1)^{k-1}}{\lambda} A_{2k-1} w(t) $$

If we split the space into the kernel and invariant subspaces on which A2k− 1 is non-degenerate, we can decompose Q2k− 1 as a direct sum of two-dimensional forms. Imposing periodic boundary conditions, we end up with the following boundary value problems:

$$ \left\{\begin{array}{ll} w_{1}^{(2k-1)}(t) &=- \frac{(-1)^{(k-1)}\mu}{\lambda}w_{2}\\ w_{2}^{(2k-1)}(t) &= \frac{(-1)^{(k-1)}\mu}{\lambda} w_{1} \end{array}\right. \quad \left\{\begin{array}{l} w_{1}^{(j)}(0) = w_{1}^{(j)}(1), \\ w_{2}^{(j)}(0) = w_{2}^{(j)}(1) \end{array}\right. \text{ for } 0\le j\le 2k-2. $$
(13)

Lemma 2

The boundary value problem in (12) has a solution if and only if

$$ \lambda \in \left\{\frac{\mu}{(2\pi r)^{2k}}: r \in \mathbb{N}\right\}. $$

Moreover, any such λ has multiplicity 2. In particular, the decreasing sequence of λ for which (12) has solutions satisfies:

$$ \lambda_{r} = \frac{\mu}{(2\pi \lceil r/2 \rceil)^{2k}} = \frac{\mu}{(\pi r)^{2k}} + O(r^{-(2k+1)}), \quad r \in \mathbb{N} $$

Similarly, the boundary value problem in (13) has a solution if and only if:

$$ \lambda \in \left\{\frac{\vert\mu \vert}{(2\pi r)^{2k-1}}: r \in \mathbb{Z}\right\} $$

and any such λ has again multiplicity 2. The monotone rearrangement of λ for which there exists a solution to the boundary value problem is:

$$ \lambda_{r} = \frac{\vert\mu\vert }{(2\pi \lceil r/2 \rceil)^{2k-1}} = \frac{\vert \mu \vert}{(\pi r)^{2k-1}} + O\left( r^{-(2k)}\right), \quad r \in \mathbb{Z} $$

Proof

Any solution of the equation \(w^{(2k)}(t) = \frac {(-1)^{k}\mu }{\lambda } w(t)\) can be expressed as a combination of trigonometric and hyperbolic functions with the appropriate frequencies.

Without loss of generality we can assume μ > 0; we have to consider two separate cases:

  1. Case 1:

    k even and λ > 0 or k odd and λ < 0

In this case, the quantity \((-1)^{k}\mu\lambda^{-1} > 0\). If we set \(a^{2k} = (-1)^{k}\mu\lambda^{-1}\) with a > 0, we have to solve:

$$ w^{(2k)} (t)= a^{2k} w(t), \qquad w^{(j)}(0)=w^{(j)}(1), 0\le j<2k. $$
(14)

A basis for the space of solutions to the ODE is then \(\{e^{\omega ^{j} a t}: 0 \le j < 2k\}\), where \(\omega = e^{i\pi /k}\). For us, it will be more convenient to switch to a real representation of the space of solutions. Notice the following symmetry of the 2k-th roots of unity: if η is a root of 1 different from ±1, ±i, then \(\{\eta , \bar \eta ,-\eta , -\bar \eta \}\) are still distinct roots of 1 (this is also a Hamiltonian feature of the problem).

If we write η = η1 + iη2, this symmetry implies that the space generated by \(\{e^{\eta t }, e^{\bar \eta t },e^{-\eta t },e^{-\bar \eta t}\}\) is the same as the space generated by

$$ \{\sin(\eta_{2} t)\sinh(\eta_{1} t),\sin(\eta_{2} t)\cosh(\eta_{1} t),\cos(\eta_{2} t)\sinh(\eta_{1} t),\cos(\eta_{2} t)\cosh(\eta_{1} t)\}.$$

Let us rescale the argument of these functions by a (so that they solve (14)) and call their linear span \(U_{\eta }\); we then define \(U_1\) to be the span of \(\{\sinh (at),\cosh (at)\}\) and \(U_i\) the span of \(\{\sin (at),\cos (at)\}\). Note that \(U_i\) appears if and only if k is even.

Thus, the solution space for our problem is \(\bigoplus _{\eta } U_{\eta }\), where η ranges over the set \(E = \{\eta : \text{Re}(\eta) \ge 0, \text{Im}(\eta) \ge 0, \eta^{2k} = 1\}\).

Now, we have to impose the boundary conditions. Notice that, if k is even, then \(U_i\) consists of trigonometric functions, which satisfy the periodic boundary conditions precisely when \(a \in 2\pi\mathbb{N}\). We look for further solutions in the complement \(\bigoplus _{\eta \ne i} U_{\eta }\). Suppose by contradiction that w is one such solution. Write \(w = {\sum }_{\eta } w_{\eta }\) with \(w_{\eta } \in U_{\eta }\) and let \(b = \sup \{\Re (\eta ) : \eta \in E, w_{\eta } \ne 0\}\). Then either \(\sinh (b a t)\) or \(\cosh (b a t)\) is present in the decomposition of w, so that:

$$w(t) = \sinh(b a t) \frac{w(t)}{\sinh(b a t )} = \sinh(b a t) g(t), \quad 0 \not \equiv \vert g(t) \vert < C \text{ for \textit{t} large enough}$$

and so |w| is unbounded as \(t \to + \infty \) (or \(-\infty \)), and thus w is not periodic. It follows that there are periodic solutions only if k is even (and thus λ > 0) and \(a = 2\pi r = \sqrt [2k]{\frac {\mu }{\lambda }}\). Notice that we have two independent solutions, so if we arrange the solutions in decreasing order, we have:

$$ \lambda_{r} = \frac{\mu}{(2\pi \lceil r/2 \rceil)^{2k}}, \quad r \in \mathbb{N} $$
  2. Case 2:

    k odd and λ > 0 or k even and λ < 0

In this case, we have to look at the roots of −1, but the argument is very similar. If k is even, there are no solutions, since purely imaginary frequencies are absent. If k is odd, set \(\vert\mu\lambda^{-1}\vert = a^{2k}\); then the boundary value problem is:

$$ w^{(2k)} (t)=- a^{2k} w(t) \qquad w^{(j)}(0)=w^{(j)}(1), 0\le j<2k. $$

The roots of −1 are just the roots of 1 rotated by i (recall that k is odd here). Now, the space of solutions is \(\bigoplus _{\eta \ne 1} U_{\eta } \). We find again two independent solutions; if we arrange them in order, we get:

$$ \lambda_{r} = \frac{\mu}{(2\pi \lceil r/2 \rceil)^{2k}}, \quad r \in \mathbb{N} $$

Notice that positive μ gives rise to positive eigenvalues. Thus, if we consider μ < 0, we get the same result but with switched signs.

We can reduce the odd case (13) to the even one. Consider the 1-dimensional equation of twice the order, i.e.:

$$ w_{1}^{(2(2k-1))}(t) =- \frac{\mu^{2}}{\lambda^{2}} w_{1}(t) $$

Now, the discussion above tells us that there are exactly two independent solutions with periodic boundary conditions whenever λ satisfies \(\sqrt [2k-1]{\frac {\mu }{\vert \lambda \vert }} = 2 r \pi \). It follows that again there are two independent solutions, this time for both signs of λ. If we arrange them in order, we get:

$$ \lambda_{r} = \frac{\mu}{(2 \pi \lceil r/2 \rceil)^{2k-1}}, \quad \lambda_{-r} = -\frac{\mu}{(2 \pi \lceil r/2 \rceil)^{2k-1}}, \quad r \in \mathbb{N}. $$
□
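
Both solvability conditions can be sanity-checked by substituting the Fourier modes directly; a short numerical check of ours for the even case (12):

```python
import numpy as np

mu, k = 1.5, 2
for r in range(1, 5):
    lam = mu / (2 * np.pi * r) ** (2 * k)  # candidate eigenvalue from Lemma 2
    # w(t) = exp(2*pi*i*r*t) is 1-periodic together with all its derivatives,
    # and w^{(2k)} = (2*pi*i*r)^{2k} w = (-1)^k (2*pi*r)^{2k} w.
    print(r, np.isclose((2j * np.pi * r) ** (2 * k), (-1) ** k * mu / lam))
```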

Proposition 1

Let μ > 0 and \(s \in (0,+\infty )\); denote by \(\eta_s\) the number of solutions of (12) with λ greater than s and, similarly, denote by \(\omega_s\) the number of solutions, with λ bigger than s, of:

$$ w^{(2k)}(t) = \frac{(-1)^{k}\mu}{\lambda} w(t), \quad w^{(j)}(0) = w^{(j)}(1)=0,\quad k\le j\le 2k-1 $$
(15)

Then, \(\vert\omega_s - \eta_s\vert \le 2k\). The same conclusion holds for (13).

Proof

The result follows from standard results about the Maslov index of a path in the Lagrange Grassmannian. References on the topic can be found in [2, 5, 6]. Let us illustrate briefly the construction. Let (Σ,σ) be a symplectic space; the Lagrange Grassmannian is the collection of Lagrangian subspaces of Σ and it has the structure of a smooth manifold. For any Lagrangian subspace \(L_0\), we define the train of \(L_0\) to be the set \(T_{L_{0}}=\{L \text { Lagrangian}: L \cap L_{0} \ne (0)\}\). \(T_{L_{0}}\) is a stratified set; the biggest stratum has codimension 1 and is endowed with a co-orientation. If γ is a smooth curve with values in the Lagrange Grassmannian (i.e. a smooth family of Lagrangian subspaces) which intersects \(T_{L_{0}}\) transversally in its smooth part, one defines an intersection number by counting the intersection points weighted with a plus or minus sign depending on the co-orientation. Tangent vectors at a point L of the Lagrange Grassmannian are naturally interpreted as quadratic forms on L (which is a subspace of Σ). We say that a curve is monotone if at any point its velocity is either a non-negative or a non-positive quadratic form. For monotone curves, the Maslov index counts the number of intersections with the train up to sign. For generic continuous curves, it is defined via a homotopy argument.

Denote by \(\text {Mi}_{L_{0}}(\gamma )\) the Maslov index of a curve γ and let \(L_1\) be another Lagrangian subspace. In [2], the following inequality is proved:

$$ \vert \text{Mi}_{L_{0}}(\gamma)-\text{Mi}_{L_{1}}(\gamma)\vert \le \frac{\dim({\Sigma})}{2} $$
(16)

Let us apply this result to our problem. First of all, let us produce a curve in the Lagrange Grassmannian whose Maslov index coincides with the counting functions \(\omega_s\) and \(\eta_s\). The right candidate is the graph of the fundamental solution of \(w^{(2k)}(t) = \frac {(-1)^{k}\mu }{\lambda } w(t)\).

We write down a first order system on \(\mathbb {R}^{2k}\) equivalent to our boundary value problem: calling \(x_j\) the coordinates on \(\mathbb {R}^{2k}\), set:

$$ x_{j+1} (t)= w^{(j)}(t) \Rightarrow \dot{x}_{j} = x_{j+1} \text{ for }1\le j\le 2k-1, \quad \dot x_{2k} = \frac{(-1)^{k} \mu}{\lambda}x_{1}. $$

For simplicity, call \(\frac {(-1)^{k} \mu }{\lambda } = a\); the matrix we obtain has the following structure:

$$ A_{\lambda}= \left( \begin{array}{cccc} 0 & & & a\\ 1 &0 & & \\ &{\ddots} &\ddots& \\ & &1 &0 \end{array}\right) $$

This matrix is not Hamiltonian with respect to the standard symplectic form on \(\mathbb {R}^{2k}\), but it is straightforward to compute a similarity transformation that sends it to a Hamiltonian one (recall that we already used that \(A_{\lambda}\) has the spectrum of a Hamiltonian matrix). Moreover, the change of coordinates can be chosen to be block diagonal and thus preserves the subspace \(B = \{x_{j} = 0,\; j > k\}\), which remains Lagrangian too. Since later on we will have to show that the curve we consider is monotone, we will give this change of coordinates explicitly. Define the matrix S by setting \(S_{i,k-i+1} = (-1)^{i-1}\) and zero otherwise; it is the matrix with alternating ±1 on the anti-diagonal. Define the following 2k × 2k matrices:

$$ G = \left( \begin{array}{cc} Id_{k} &0\\0 &S \end{array}\right) \quad G^{-1} = \left( \begin{array}{cc} Id_{k} &0 \\ 0 & (-1)^{k-1}S \end{array}\right) \quad \hat{A}_{\lambda} = G A_{\lambda} G^{-1} $$

Set N to be the lower triangular k × k shift matrix (i.e. the upper left block of \(A_{\lambda}\) above) and E the matrix with just a 1 in position (1,k) (i.e. the lower left block of \(A_{\lambda}\)). The new matrix of coefficients is:

$$ \hat A_{\lambda} =\left( \begin{array}{cc} N &a\, ES^{-1} \\ SE &-N^{*} \end{array}\right) \quad ES^{-1} = \text{diag}(1,\dots,0), \quad SE = (-1)^{k-1}\,\text{diag}(0,\dots,0,1). $$
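
Since several sign conventions are in play here, the following sketch of ours (with one possible reading of the index conventions) checks numerically that \(\hat A_{\lambda} = G A_{\lambda} G^{-1}\) is Hamiltonian and that \(J\partial_{\lambda}\hat A_{\lambda}\) is diagonal with a single non-zero entry, as used in the monotonicity argument below:

```python
import numpy as np

def check(k, a=3.7):
    n = 2 * k
    A = np.diag(np.ones(n - 1), -1)    # subdiagonal of ones
    A[0, n - 1] = a                    # a = (-1)^k mu / lambda
    S = np.zeros((k, k))
    for i in range(k):
        S[i, k - 1 - i] = (-1.0) ** i  # S_{i, k-i+1} = (-1)^{i-1}, 1-indexed
    Z = np.zeros((k, k))
    G = np.block([[np.eye(k), Z], [Z, S]])
    Ahat = G @ A @ np.linalg.inv(G)
    Jn = np.block([[Z, -np.eye(k)], [np.eye(k), Z]])
    hamiltonian = np.allclose(Jn @ Ahat, (Jn @ Ahat).T)  # J*Ahat symmetric
    E = np.zeros((n, n)); E[0, n - 1] = 1.0              # dA/da
    JD = Jn @ (G @ E @ np.linalg.inv(G))                 # J * dAhat/da
    diagonal = np.allclose(JD, np.diag(np.diag(JD)))
    one_entry = np.count_nonzero(~np.isclose(np.diag(JD), 0)) == 1
    return hamiltonian and diagonal and one_entry

print(all(check(k) for k in range(1, 6)))  # True
```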

Now, we are ready to define our curve. First of all, the symplectic space we are going to use is \((\mathbb {R}^{4k}, \sigma \oplus (- \sigma ))\), where σ is the standard symplectic form; in this way graphs of symplectic transformations are Lagrangian subspaces. Sometimes, we will denote the direct sum of the two symplectic forms with opposite signs by σ ⊖ σ. Let \({\Phi}_{\lambda}^{t}\) be the fundamental solution of \(\dot {\Phi }_{\lambda }^{t} = \hat A_{\lambda } {\Phi }_{\lambda }^{t}\) and set \({\Phi}_{\lambda} = {\Phi}_{\lambda}^{1}\), its value at time t = 1. Consider its graph:

$$ \gamma: \lambda \mapsto {\Gamma}({\Phi}^{1}_{\lambda})= {\Gamma}({\Phi}_{\lambda}), \quad \lambda \in (0,+\infty) $$

Once we prove that γ is monotone, it is straightforward to check that \(\text {Mi}_{B \times B} (\gamma \vert _{[s,+\infty )})\) counts the number of solutions to the boundary value problem given in (15) for λ ≥ s, and similarly \(\text {Mi}_{\Gamma (I)} (\gamma \vert _{[s,+\infty )})\) counts the solutions of (12) for λ ≥ s. Here, Γ(I) stands for the graph of the identity map (i.e. the diagonal subspace).

Let us check that the curve is monotone. As already mentioned, tangent vectors in the Lagrange Grassmannian can be interpreted as quadratic forms. Being monotone means that the following quadratic form is either non-negative or non-positive:

$$ \left( \partial_{\lambda}\gamma\right)(\xi)= \sigma ({\Phi}_{\lambda} \xi,\partial_{\lambda} {\Phi}_{\lambda} \xi), \quad \xi \in \mathbb{R}^{2k} $$

We use the ODE for Φλ(t) to prove monotonicity:

$$ \begin{array}{@{}rcl@{}} \sigma ({\Phi}_{\lambda} \xi,\partial_{\lambda} {\Phi}_{\lambda} \xi) &=&{{\int}_{0}^{1}} \frac{d}{dt}\left( \sigma ({\Phi}^{t}_{\lambda} \xi,\partial_{\lambda} {\Phi}^{t}_{\lambda} \xi)\right)dt + \sigma ({\Phi}^{0}_{\lambda} \xi,\partial_{\lambda} {\Phi}^{0}_{\lambda} \xi)\\ &=&{{\int}_{0}^{1}}\sigma(\hat A_{\lambda} {\Phi}^{t}_{\lambda}\xi,\partial_{\lambda} {\Phi}^{t}_{\lambda} \xi) + \sigma({\Phi}^{t}_{\lambda}\xi,\left( \partial_{\lambda}\hat A_{\lambda} {\Phi}^{t}_{\lambda} +\hat{A}_{\lambda} \partial_{\lambda} {\Phi}^{t}_{\lambda} \right)\xi)dt \\ &=& {{\int}_{0}^{1}}\sigma({\Phi}^{t}_{\lambda}\xi,\partial_{\lambda}\hat A_{\lambda} {\Phi}^{t}_{\lambda} \xi)dt \end{array} $$

where we used the facts that \(\partial _{\lambda } {\Phi }^{0}_{\lambda } = \partial _{\lambda } Id =0\) and that \(\hat A_{\lambda }\) is Hamiltonian, so that \(J\hat A_{\lambda } = -\hat A_{\lambda }^{*} J \), to cancel the first and third terms. It remains to check \(J\partial _{\lambda } \hat {A}_{\lambda }\). It is straightforward to see that it is a diagonal matrix with a single non-zero entry; thus, it is either non-negative or non-positive. So \(\partial_{\lambda}\gamma\) is either non-positive or non-negative, being the integral of a non-positive or non-negative quantity (the sign is independent of ξ).

Now, the statement follows from inequality (16). □

We are finally ready to compute the asymptotics for \(Q_j\) when the matrix \(A_j\) is constant. The next proposition translates the estimate on the counting functions \(\eta_s\) and \(\omega_s\) defined in Proposition 1 into an estimate for the eigenvalues.

Proposition 2

Let Qj be any of the forms appearing in (11).

  • Suppose j = 2k and \(Q_{2k}(v) = {{\int \limits }_{0}^{1}} \langle A_{2k}v_{k}, v_{k} \rangle dt\) with A2k symmetric and constant and let Σ2k be its spectrum. Define

    $$\xi_{+} = \left( {\sum}_{\mu \in {\Sigma}_{2k},\mu>0} \sqrt[j]{\mu} \right)^{j} \text{ and } \xi_{-} =\left( {\sum}_{\mu \in {\Sigma}_{2k},\mu<0} \sqrt[j]{\vert\mu\vert} \right)^{j}. $$

    Then, \(Q_{2k}\) has capacity (ξ+,ξ−) with remainder of order one. Moreover, if \(A_{2k}\) is m × m and \(r \in \mathbb {N}\), for r ≥ mk:

    $$ \frac{\xi_{+}}{\pi^{j} (r-2mk-p(r))^{j}} \ge \lambda_{r} \ge\frac{\xi_{+}}{\pi^{j}(r+2mk+p(r))^{j}} $$
    (17)

    where p(r) = 0 if r is even and p(r) = 1 if r is odd. Similar bounds hold for negative r, with ξ−.

  • Suppose j = 2k + 1 and \(Q_{2k+1}(v) = {{\int \limits }_{0}^{1}} \langle A_{2k+1}v_{k}, v_{k+1} \rangle dt\) with \(A_{2k+1}\) skew-symmetric and constant, and let \(\Sigma_{2k+1}\) be its spectrum. Define

    $$\xi = \left( {\sum}_{\mu \in {\Sigma}_{2k+1},-i\mu>0} \sqrt[j]{-i\mu} \right)^{j}.$$

    Then, \(Q_{2k+1}\) has capacity ξ with remainder of order one. Moreover, if \(A_{2k+1}\) is m × m and \(r \in \mathbb {Z}\), for |r| ≥ mk:

    $$ \frac{\xi}{\pi^{j} (r-2mk-p(r))^{j}} \ge \lambda_{r} \ge\frac{\xi}{\pi^{j}(r+2mk+p(r))^{j}} . $$
    (18)

Proof

First of all, we consider a 1-dimensional system and we write the inequality \(\vert\omega_s - \eta_s\vert \le 2k\) as an inequality for the eigenvalues. Notice that if we have two integer valued functions \(f,g : \mathbb {R}\to \mathbb {N}\) and an inequality of the form:

$$ g(s)\ge \#\{\lambda \text{ solutions of (15) }: \lambda \ge s \} \ge f(s), $$

it means that we have at least f(s) solutions bigger than s and at most g(s). This implies that the sequence of ordered eigenvalues satisfies:

$$ \lambda_{f(s)} \ge s, \quad \lambda_{g(s)}\le s. $$

Now, we compute these quantities explicitly. By virtue of Proposition 1, we can take as upper/lower bounds for the counting function g(s) = \(\eta_s + 2k\) and f(s) = \(\eta_s - 2k\). We choose the point \(s = \frac {\mu }{(2\pi r)^{j}}\). It is straightforward to see that:

$$ \eta_{s}\vert_{s =\frac{\mu}{(2\pi r)^{j}}} = 2 \#\left\{l \in \mathbb{N} : \frac{\mu}{(2\pi l)^{j}}\ge\frac{\mu}{(2\pi r)^{j}}\right\} = 2 r. $$

And thus we obtain:

$$ \lambda_{2(r-k)} \ge \frac{\mu}{(2\pi r)^{j}}, \quad \lambda_{2(r+k)}\le \frac{\mu}{(2\pi r)^{j}}. $$

Now, if we change the labelling, we find that, for l ≥ k:

$$ \frac{\mu}{(2\pi (l-k))^{j}}\ge \lambda_{2l} \ge \frac{\mu}{(2\pi (l+k))^{j}}. $$

By definition \(\lambda_{2l} \ge \lambda_{2l+1} \ge \lambda_{2l+2}\) and thus we have a bound for any index \(r \in \mathbb {N}\).

Now, we consider the m-dimensional system; notice that we reduced the problem, via diagonalization, to the sum of m 1-dimensional systems. Thus, our form \(Q_j\) is always a direct sum of 1-dimensional objects. We show now how to recover the desired estimate for the sum of quadratic forms.

First of all, observe that counting functions are additive with respect to direct sums. In fact, if \(Q = \oplus _{i=1}^{m} Q_{i}\), λ is an eigenvalue of Q if and only if it is an eigenvalue of \(Q_i\) for some i. We proceed as we did before. Suppose that \(Q_a\) is 1-dimensional and \(Q_{a} (v) = {{\int \limits }_{0}^{1}} \mu _{a} \vert v_{k}(t)\vert ^{2} dt \). Let us compute \(\eta_s\) at the point \(s_{0} = \left ({\sum }_{i=1}^{m} \sqrt [j]{\mu _{i}}\right )^{j}/(2\pi l)^{j}\):

$$ 2\#\left\{r\in\mathbb{N} : \frac{\mu_{a}}{(2\pi r)^{j}} \ge\frac {\left( {\sum}_{i=1}^{m} \sqrt[j]{\mu_{i}}\right)^{j}}{(2\pi l)^{j}} \right\} = 2\#\left\{r\in\mathbb{N} : \frac{\sqrt[j]{\mu_{a}}}{\left( {\sum}_{i=1}^{m} \sqrt[j]{\mu_{i}}\right)r } \ge\frac {1}{l}\right\} $$

Set for simplicity \(c_{a} = \frac {\sqrt [j]{\mu _{a}}}{{\sum }_{i=1}^{m} \sqrt [j]{\mu _{i}}}\); it is straightforward to see that the cardinality of the above set is \(\# \{r \in \mathbb {N}:r\le c_{a} l\} = \lfloor c_{a} l \rfloor \). Now, we are ready to prove the estimates for the direct sum of forms. Adding everything we have:

$$ 2{\sum}_{a=1}^{m} (\lfloor c_{a} l \rfloor +k) \ge \#\left\{ \text{eigenvalues of } Q \ge \frac {({\sum}_{i=1}^{m} \sqrt[j]{\mu_{i}})^{j}}{(2\pi l)^{j}} \right\} \ge 2{\sum}_{a=1}^{m}(\lfloor c_{a} l \rfloor -k) $$

It is clear that \({\sum }_{a=1}^{m} c_{a} =1\) and that \(l+mk \ge {\sum }_{a=1}^{m} (\lfloor c_{a} l \rfloor +k)\); similarly, \({\sum }_{a=1}^{m} (\lfloor c_{a} l \rfloor -k) \ge l-m(k+1)\), since \(\lfloor c_{a} l \rfloor \ge c_{a} l - 1\). Rewriting for the eigenvalues, with l ≥ mk we obtain:

$$ \frac{\left( {\sum}_{i=1}^{m} \sqrt[j]{\mu_{i}}\right)^{j}}{(2\pi(l-mk))^{j}} \ge \lambda_{2l} \ge\frac{\left( {\sum}_{i=1}^{m} \sqrt[j]{\mu_{i}}\right)^{j}}{(2\pi(l+mk))^{j}}. $$

It is straightforward to compute the bounds in (17) and (18) observing again that \(\lambda_{2l} \ge \lambda_{2l+1} \ge \lambda_{2l+2}\). □

Remark 3

The shift proportional to m appearing in (17) and (18) is due to the fact that we are considering the direct sum of m quadratic forms. It is worth noticing that this does not depend on the fact that we are considering a quadratic form on \(L^{2}([0,1],\mathbb R^{m})\): the estimates in (17) and (18) hold whenever we consider the direct sum of m 1-dimensional forms with constant coefficients. This observation will be used in the proof of Theorem 1 below.

Now, we prove some properties of the capacities which are closely related to the explicit estimate we have just proved for the linear case. As done so far, we state the proposition for ordered positive eigenvalues. An analogous statement is true for the negative ones.

Proposition 3

Suppose that Q is a quadratic form on a Hilbert space and let \(\{\lambda _{n}\}_{n \in \mathbb {N}}\) be its positive ordered eigenvalues. Suppose that:

$$ \lambda_{n} = \frac{\zeta}{n^{j}} + O(n^{-j-\nu}) \quad \nu >0, j \in \mathbb{N} \text{ as } n \to +\infty. $$
  1.

    Then, for any such Qi on a Hilbert space \({\mathscr{H}}_{i}\), the direct sum \(Q = \oplus _{i=1}^{m} Q_{i}\) satisfies:

    $$ \lambda_{n} = \left( {\sum}_{i=1}^{m}\frac{\sqrt[j]{\zeta_{i}}}{n}\right)^{j} + O(n^{-j-\nu}) \quad \nu >0, j \in \mathbb{N} \text{ as } n \to +\infty. $$
  2.

    Suppose that U is a subspace of finite codimension \(d<\infty \); then

    $$ \lambda_{n}(Q\vert_{U}) = \frac{\zeta}{n^{j}} + O(n^{-j-\nu}) \iff\lambda_{n}(Q) = \frac{\zeta}{n^{j}} + O(n^{-j-\nu}), $$

    as \( n \to +\infty \).

  3.

    Suppose that Q and \(\hat {Q}\) are two quadratic forms. Suppose that Q is as at the beginning of the proposition and \(\hat {Q}\) satisfies:

    $$ \lambda_{n}(\hat{Q}) = O(n^{-j-\mu}) \quad \mu >0, \text{ as } n \to + \infty. $$

    Then, the sum \(Q^{\prime } = Q+\hat {Q}\) satisfies:

    $$ \lambda_{n}(Q^{\prime}) = \frac{\zeta}{n^{j}} + O(n^{-\nu^{\prime}}), \quad \nu^{\prime} = \min\left\{\frac{j+\mu}{j+\mu+1}(j+1),\,j+\nu\right\}. $$

Proof

The asymptotic relation can be written in terms of a counting function. Take the j-th root of the eigenvalues of \(Q_i\); then it holds that

$$ \# \left\{n \in \mathbb{N} \vert 0\le \frac{1}{\sqrt[j]{\lambda_{n}}}\le k\right\} = \sqrt[j]{\zeta_{i}} k+ O(k^{1-\nu}) $$

So, summing up all the contributions, we get the estimate in i).

The min-max principle implies that we can control the n-th eigenvalue of Q with the n-th and (n−d)-th eigenvalues of \(Q\vert_{U}\), i.e.:

$$ \lambda_{n}(Q\vert_{U}) \le\lambda_{n}(Q)\le \lambda_{n-d}(Q\vert_{U}) \le\lambda_{n-d}(Q) $$

So, if the codimension is fixed, it is equivalent to provide an estimate for the eigenvalues of Q or for those of \(Q\vert_{U}\).

For the last point we use Weyl's inequality: we can estimate the (i+j)-th eigenvalue of a sum of quadratic forms with the sum of the i-th and j-th eigenvalues of the summands. Write, as in [3], \(Q^{\prime }\) as \(Q+\hat {Q}\) and Q as \(Q^{\prime } + (-\hat {Q})\), and choose \(i = n -\lfloor n^{\delta}\rfloor\) and \(j = \lfloor n^{\delta}\rfloor\) in the first case and i = n and \(j = \lfloor n^{\delta}\rfloor\) in the second. This implies:

$$ \lambda_{n+\lfloor n^{\delta} \rfloor}(Q)+\lambda_{\lfloor n^{\delta} \rfloor}(\hat{Q}) \le \lambda_{n}(Q^{\prime})\le \lambda_{n-\lfloor n^{\delta} \rfloor}(Q)+\lambda_{\lfloor n^{\delta} \rfloor}(\hat{Q}) $$

The best remainder is computed as \(\nu ^{\prime } = \max \limits _{\delta \in (0,1)}\min \limits \{(j+\mu )\delta ,j+1-\delta ,j+\nu \}\): when j + ν is not the binding constraint, the optimum is attained at \((j+\mu)\delta = j+1-\delta\), i.e. \(\delta = \frac{j+1}{j+\mu+1}\), which gives \(\nu^{\prime} = \frac{(j+\mu)(j+1)}{j+\mu+1}\). □
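
The trade-off between the three constraints is easy to visualize; a small numerical confirmation of ours (the test values of j, μ and ν are arbitrary):

```python
import numpy as np

j, mu, nu = 2, 1.0, 5.0
deltas = np.linspace(0.0, 1.0, 200_001)
best = max(min((j + mu) * d, j + 1 - d, j + nu) for d in deltas)
print(best, (j + mu) * (j + 1) / (j + mu + 1))  # both ~ 2.25
```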

Collecting all the facts above, we have the following estimate on the decay of the eigenvalues of \(Q_j\), independently of any analyticity assumption on the kernel.

Proposition 4

Take \(Q_j\) as in the decomposition (11) of Lemma 1. Then, the eigenvalues of \(Q_j\) satisfy:

$$ \lambda_{n}(Q_{j}) = O\left( \frac{1}{n^{j}} \right) \quad \text{ as } n \to \pm \infty $$

Moreover, for any \(k \in \mathbb {N}\) and for any 0 ≤ s ≤ k, the forms \(Q_{2k+1}\) and \(Q_{2k}\) have the same first-term asymptotics as the forms:

$$ \begin{aligned} \hat{Q}_{2k+1,s}(v) = (-1)^{s}{{\int}_{0}^{1}} \langle A_{2k+1} v_{k+1+s}(t),v_{k-s}(t) \rangle dt \\ \hat{Q}_{2k,s}(v) = (-1)^{s}{{\int}_{0}^{1}} \langle A_{2k} v_{k+s}(t),v_{k-s}(t) \rangle dt \end{aligned} $$

Proof

Let us start with the even case, j = 2k. It holds that:

$$ \vert Q_{2k}(v)\vert = \left\vert{\int}_{0}^{1}\langle A_{t} v_{k}(t),v_{k}(t)\rangle dt \right\vert \le C {{\int}_{0}^{1}} \langle v_{k}(t),v_{k}(t) \rangle dt $$

where \(C = \max \limits _{t} \vert \vert A_{t}\vert \vert \). By comparison with the constant coefficient case, we get the bound.

Suppose now that j = 2k + 1. As before, there is a constant C such that

$$ \vert Q_{2k+1}(v)\vert = \left\vert{\int}_{0}^{1}\langle A_{t} v_{k}(t),v_{k+1}(t)\rangle dt \right\vert \le C \Vert v_{k} \Vert_{2} \Vert v_{k+1}\Vert_{2} $$

Consider now the following quadratic forms on \(L^{2}([0,1],\mathbb {R}^{k})\):

$$F_{k}(v) = {{\int}_{0}^{1}}\vert \vert v_{k}(t)\vert\vert^{2} dt=\Vert v_{k} \Vert_{2}^{2}, \quad F_{k+1}(v) = {{\int}_{0}^{1}}\vert \vert v_{k+1}(t)\vert\vert^{2} dt= \Vert v_{k+1}\Vert_{2}^{2}$$

Define \(V_{n} = \{v_{1}, \dots , v_{n}\}^{\perp }\) where vi are linearly independent eigenvectors of Fk associated to the first n eigenvalues \(\lambda _{1}\ge {\dots } \ge \lambda _{n}\). Similarly define \(U_{n} = \{u_{1}, \dots , u_{n}\}^{\perp } \) to be the orthogonal complement to the eigenspace associated to the first n eigenvalues of Fk+ 1. It follows that:

$$ \lambda_{2n}(Q_{2k+1})\le \max_{v \in V_{n}\cap U_{n}} C \Vert v_{k}\Vert_{2}\Vert v_{k+1}\Vert_{2} \le C \max_{v \in V_{n}} \Vert v_{k}\Vert_{2}\max_{v \in U_{n}} \Vert v_{k+1}\Vert_{2} $$

We already have estimates for the eigenvalues of \(F_k\) and \(F_{k+1}\), since we have already dealt with the constant coefficient case. By virtue of the choice of the subspaces \(V_n\) and \(U_n\), the maxima in the right hand side are the square roots of the n-th eigenvalues of the respective forms. Thus, one gives a contribution of order \(n^{-k}\) and the other of order \(n^{-(k+1)}\), and the first part of the proposition is proved.

For the second part, without loss of generality suppose that j = 2k. The other case is completely analogous.

$$ \begin{array}{@{}rcl@{}} Q_{2k}(v) &=& {{\int}_{0}^{1}}\langle v_{k},A_{t} v_{k} \rangle dt = {{\int}_{0}^{1}} \langle v_{k}, {{\int}_{0}^{t}} A_{\tau} v_{k-1}(\tau)+\dot{A}_{\tau} v_{k}(\tau) d\tau \rangle dt\\ &=& - {{\int}_{0}^{1}} \langle v_{k+1}(t),A_{t}v_{k-1}(t)\rangle dt-{{\int}_{0}^{1}}\langle v_{k+1}(t),\dot{A}_{t} v_{k}(t)\rangle dt \end{array} $$

The second term above is of higher order by the first part of the proposition, and so, iterating the integration by parts on the first term, at step s we get that:

$$ \begin{array}{@{}rcl@{}} {{\int}_{0}^{1}} \langle v_{k+s}(t),A_{t} v_{k-s}(t) \rangle dt &=& - {{\int}_{0}^{1}} \langle v_{k+s+1}(t),A_{t} v_{k-s-1}(t) \rangle dt\\ &&\quad- {{\int}_{0}^{1}}\langle v_{k+s+1}(t), \dot{A}_{t} v_{k-s}(t) \rangle dt \end{array} $$

The second term on the right hand side is again of order \(n^{-(2k+1)}\); this can be checked in the same way as in the first part of the proposition. This finishes the proof. □

Now, we prove the main result of this section:

Proof of Theorem 1

Suppose that j = 2k is even. We work on \(V_{k} = \{v \in L^{2}([0,1],\mathbb {R}^{m}) : v_{j}(0) = v_{j}(1) =0, 0<j\le k\}\). Then

$$ Q(v) = Q_{2k}(v) + R_{k}(v) = {{\int}_{0}^{1}} \langle A_{t} v_{k}(t),v_{k}(t)\rangle dt + R_{k}(v) $$

Since the matrix \(A_t\) is analytic, we can diagonalize it piecewise analytically in t (see [11]). Thus, there exists a piecewise analytic orthogonal matrix \(O_t\) such that \(O_{t}^{*}A_{t}O_{t}\) is diagonal. By the second part of Proposition 4, if we make the change of coordinates \(v_t \mapsto O_t v_t\), we can reduce to studying the direct sum of m 1-dimensional forms. Without loss of generality, we consider forms of the type:

$$ Q_{2k}(v) = {{\int}_{0}^{1}}a_{t}\vert\vert v_{k}(t)\vert\vert^{2} dt ={{\int}_{0}^{1}}a_{t} v_{k}(t)^{2} dt $$

where now at is piecewise analytic and vk a scalar function.

For simplicity, we can assume that \(a_t\) does not change sign and is analytic on the whole interval. If that were not the case, we could just divide [0,1] into a finite number of intervals and study \(Q_{2k}\) separately on each of them.

Pick a point \(t_0\) in (0,1) and consider the following subspace of codimension mk in \(V_k\):

$$ V_{k}\supset V^{t_{0}}_{k} = \{v \in V_{k} : v_{j}(0) = v_{j}(t_{0}) = v_{j}(1) =0 , 0<j\le k\} $$

For tt0, define \(v_{j}^{t_{0}}: = {\int \limits }_{t_{0}}^{t}v^{t_{0}}_{j-1}(\tau )d\tau \) and v0 = vVk. It is straightforward to check that on \(V_{k}^{t_{0}}\) the form Q2k splits as a direct sum:

$$Q_{2k}(v) = {\int}_{0}^{t_{0}}\langle A_{t} v_{k}(t),v_{k}(t)\rangle dt +{\int}_{t_{0}}^{1}\langle A_{t} v^{t_{0}}_{k}(t),v^{t_{0}}_{k}(t)\rangle dt$$

Now, by Proposition 3 (points (i) and (ii)), we can introduce as many points as we want and work separately on each segment, and the asymptotics will not change (as long as the number of points is finite).

Now, we fix a partition of [0,1], \({\Pi } = \{t_{0} =0,t_{1}, {\dots}, t_{l-1},t_{l} =1\}\). Consider the subspace \(V_{\Pi} = \{v \in L^{2} \;\vert\; v_{s}(t_{i}) = v_{s}(t_{i+1}) = 0, 0 < s \le k, t_{i} \in {\Pi}\}\), which has codimension equal to \(k\vert{\Pi}\vert\). Set \(a_{i}^{-} = \min \limits _{t \in [t_{i},t_{i+1}]}a_{t}\) and \(a_{i}^{+} = \max \limits _{t \in [t_{i},t_{i+1}]}a_{t}\). Finally, define \(v_{k}^{t_{i}}(t) = {\int \limits }_{t_{i}}^{t}{\dots } {\int \limits }_{t_{i}}^{\tau _{1}}v(\tau )d \tau {\dots } d \tau _{k-1} \). It follows immediately that on \(V_{\Pi}\):

$$ {\sum}_{i} a^{-}_{i} {\int}_{t_{i}}^{t_{i+1}} v^{t_{i}}_{k}(t)^{2} dt \le Q_{2k}(v) \le {\sum}_{i} a^{+}_{i} {\int}_{t_{i}}^{t_{i+1}} v^{t_{i}}_{k}(t)^{2} dt $$

Now, we already analysed the spectrum for the problem with constant at on [0,1]. The last step to understand the quantities on the right and left hand side is to see how the eigenvalues rescale when we change the length of [0,1].

If we look back at the proof of Lemma 2, it is straightforward to check that the length of the interval matters only when we impose the boundary conditions: on an interval of length ℓ the eigenvalues are \(\lambda = \frac {a \ell ^{2k}}{(2\pi n)^{2k}}\), again with multiplicity 2.

In particular, the estimates in (17) and (18) are still true replacing μi with \(a_{i}^{\pm } \ell ^{2k}\).

If we now replace ℓ by \(\vert t_{i+1}-t_{i}\vert\) and sum the capacities according to Proposition 3, we have the following estimate on the eigenvalues on \(V_{\Pi}\), for \(n \ge 2k\vert{\Pi}\vert\):

$$ \left( \frac{{\sum}_{i}(a_{i}^{-})^{\frac{1}{2k}}(t_{i+1}-t_{i})}{\pi (n+2\vert{\Pi} \vert k+p(n))}\right)^{2k} \le \lambda_{n}\left( Q_{2k}\vert_{V_{\Pi}} \right) \le \left( \frac{{\sum}_{i}(a_{i}^{+})^{\frac{1}{2k}}(t_{i+1}-t_{i})}{\pi(n-2\vert {\Pi} \vert k-p(n))}\right)^{2k} $$

Moreover, the min-max principle implies that, for \(n \ge k\vert{\Pi}\vert\):

$$ \lambda_{n}\left( Q_{2k}\vert_{V_{\Pi}} \right) \le \lambda_{n}\left( Q_{2k}\right) \le \lambda_{n-k\vert{\Pi} \vert}\left( Q_{2k}\vert_{V_{\Pi}} \right) $$

In particular, for \(n \ge 3k\vert{\Pi}\vert\), we have:

$$ \left( \frac{{\sum}_{i}(a_{i}^{-})^{\frac{1}{2k}}(t_{i+1}-t_{i})}{\pi (n+2\vert{\Pi} \vert k+p(n))}\right)^{2k} \le \lambda_{n}(Q_{2k} ) \le \left( \frac{{\sum}_{i}(a_{i}^{+})^{\frac{1}{2k}}(t_{i+1}-t_{i})}{\pi(n-3\vert {\Pi} \vert k-p(n))}\right)^{2k} $$
(19)

We address now the issue of the convergence of the Riemann sums. Set \(I^{\pm }_{a} = {\sum }_{i}\left (a_{i}^{\pm }\right )^{\frac {1}{2k}}(t_{i+1}-t_{i}) \) and \(I_{a} = {{\int \limits }_{0}^{1}}a^{\frac {1}{2k}}dt\). It is well known that \(I^{\pm }_{a} \to I_{a}\) as long as \(\sup _{i} \vert t_{i+1}-t_{i}\vert \) goes to zero, but we need a more quantitative bound on the rate of convergence. Using results from [9] for an equispaced partition, we have that:

$$ \vert I_{a}-I^{\pm}_{a}\vert \le C^{\pm}_{a} \frac{1}{\vert{\Pi}\vert} = \frac{C(a,k,\pm)}{\text{codim}(V_{\Pi})} $$

where C(a,k,±) is a constant that depends only on the function a and on k, and the inequality holds for \(\vert{\Pi}\vert \ge n_{0}\) sufficiently large, where \(n_0\) depends just on a and k.

Consider the right hand side of (19): adding and subtracting \(\left(\frac {I_{a}}{\pi n}\right)^{2k}\), we find that for \( n\ge \max \limits \{n_{0},k\vert {\Pi }\vert \}\):

$$ \lambda_{n}(Q_{2k} ) \le \left( \frac{I_{a}}{ \pi n}\right)^{2k} + \left( \frac{I_{a}^{+}}{\pi(n-3\vert {\Pi} \vert k-p(n))}\right)^{2k}-\left( \frac{I_{a}}{ \pi n}\right)^{2k}. $$

A simple algebraic manipulation shows that there are constants C1,C2 and C3 such that the difference on the right hand side is bounded by

$$ \frac{C_{1} n^{2k} \vert {\Pi}\vert^{-1} + C_{2}(n^{2k}-\vert {\Pi}\vert^{2k}(n/\vert {\Pi}\vert-1)^{2k})}{ C_{3}(n-3k\vert {\Pi}\vert)^{2k} n^{2k}} $$

for \(n\ge \max \limits \{3k\vert {\Pi }\vert ,n_{1}\vert {\Pi }\vert ,n_{0}\}\), where \(n_1\) is a certain threshold independent of \(\vert{\Pi}\vert\).

The idea now is to choose, for a given n, a partition Π of size \(\vert{\Pi}\vert = \lfloor n^{\delta}\rfloor\) to provide a good estimate of \(\lambda_n(Q)\). The best result in terms of approximation is obtained for \(\delta = \frac {1}{2}\). Heuristically this can be explained as follows: on one hand, the first piece of the error term is of order \(n^{-2k-\delta}\); it comes from the convergence of the Riemann sums and improves as δ → 1. On the other hand, the second term comes from the estimate on the eigenvalues and gets worse and worse as \(n^{\delta}\) becomes comparable to n.

A perfectly analogous argument allows one to construct an error function for the left side of (19) which decays as \(n^{-2k-1/2}\) for n sufficiently large.

We have proved so far that, for one dimensional forms, \(Q_{2k}\) has 2k-capacity \(\xi _{+} = ({{\int \limits }_{0}^{1}} \sqrt [2k]{a_{t}}dt)^{2k}\). Now, we apply point (i) of Proposition 3 to obtain the formula in the statement for forms on \(L^{2}([0,1],\mathbb {R}^{m})\). Finally, notice that by Proposition 4 the eigenvalues of \(R_k(v)\) decay as \(n^{-2k-1}\). If we apply point (iii) of Proposition 3, we find that \(Q_{2k}(v) + R_k(v)\) has the same 2k-capacity as \(Q_{2k}\), with remainder of order 1/2.

Now, we consider the case j = 2k − 1. The idea is to reduce to the case j = 4k − 2, as in the proof of Lemma 2, and use the symmetries of \(Q_{2k-1}\) to conclude. In the same spirit as in the beginning of the proof, let us put the kernel \(A_{2k-1}\) in canonical (block-diagonal) form. We thus reduce everything to the two-dimensional case, i.e. to the quadratic forms:

$$ Q(v) = {{\int}_{0}^{1}} \langle v_{k}(t),\left( \begin{array}{cc} 0 & -a_{t} \\ a_{t} & 0 \end{array}\right) v_{k-1}(t) \rangle dt \quad a_{t} \ge 0 $$
(20)

It is clear that the map \(v_0 \mapsto Ov_0\), where \(O = \left (\begin {array}{cc} 0 & 1 \\ 1 & 0 \end {array}\right )\), is an isometry of \(L^{2}([0,1],\mathbb {R}^{2})\) with \(Q(Ov_0) = -Q(v_0)\); so the spectrum is two-sided and the asymptotics are the same for positive and negative eigenvalues.

Now, we reduce the problem to the even case. Let us consider the square of \(Q_{2k-1}\). By Proposition 4, \(Q_{2k-1}\) has the same asymptotics as the form:

$$ \hat{Q}_{2k-1} = (-1)^{k+1}{{\int}_{0}^{1}} \langle A_{t} v_{2k-1}(t),v_{0}(t)\rangle dt \qquad F(v_{0})(t) = (-1)^{k+1} A_{t}v_{2k-1}(t) $$

So we have to study the eigenvalues of the symmetric part of F. It is clear that:

$$ \frac{(F+F^{*})^{2}}{4} = \frac{F^{2}+ F F^{*}+F^{*}F+(F^{*})^{2}}{4} $$

Thus, we have to deal with the quadratic form:

$$ \begin{array}{@{}rcl@{}} 4\tilde Q(v) &=& \langle [2F^{2}+F^{*}F + FF^{*}](v),v\rangle \\ &=& 2 \langle F(v),F^{*}(v)\rangle + \langle F^{*}(v),F^{*}(v)\rangle + \langle F(v),F(v)\rangle \end{array} $$

The last term is the easiest to write; it is just:

$$ \langle F(v),F(v) \rangle = {{\int}_{0}^{1}} \langle -{A_{t}^{2}} v_{2k-1}(t),v_{2k-1}(t) \rangle dt $$

which is precisely of the form of point (i) and gives \(\frac {1}{4}\) of the desired asymptotics. The operator \(F^{*}\) acts as follows:

$$ F^{*}(v)=(-1)^{k+1} {{\int}_{0}^{t}}{\int}_{0}^{t_{2k-1}}{\dots} {\int}_{0}^{t_{1}} A_{t_{1}} v_{0}{(t_{1})} dt_{1} {\dots} dt_{2k-1} $$

Using integration by parts one can single out the term Atv2k− 1. To illustrate the procedure, for k = 1 one gets:

$$ \begin{array}{@{}rcl@{}} F^{*}(v) &=& A_{t} v_{1}(t)-{{\int}_{0}^{t}} \dot{A}_{\tau} v_{1}(\tau) d\tau\\ \langle F^{*}(v),F^{*}(v) \rangle &=& {{\int}_{0}^{1}} \langle -{A_{t}^{2}} v_{1}(t),v_{1}(t) \rangle dt + 2{{\int}_{0}^{1}} \langle A_{t} v_{1}(t),{{\int}_{0}^{t}} \dot{A}_{\tau} v_{1}(\tau) d\tau\rangle dt+\\ && +{{\int}_{0}^{1}} \langle {{\int}_{0}^{t}} \dot{A}_{\tau} v_{1}(\tau) d\tau,{{\int}_{0}^{t}} \dot{A}_{\tau} v_{1}(\tau) d\tau\rangle dt \end{array} $$

The other terms thus do not affect the asymptotics since, by Proposition 4, they decay at least as \(O(n^{-3})\). The proof goes along the same lines for general k.

The same reasoning applies to the term \(\langle F(v),F^{*}(v)\rangle\). Summing everything, one gets that the leading term is \({{\int \limits }_{0}^{1}} \langle -{A_{t}^{2}} v_{2k-1}(t),v_{2k-1}(t) \rangle dt\), and so this is precisely the same case as point (i). Recall that \(A_t\) is a 2 × 2 skew-symmetric matrix as defined in (20); thus, the two eigenvalues of \(-A_t^2\) coincide and equal \({a_{t}^{2}}\). It follows that, for n sufficiently large, the eigenvalues of \(\tilde {Q}\) (the squares of those of the symmetric part of F) satisfy:

$$ \lambda_{n}(\tilde Q) = \frac{\left( {{\int}_{0}^{1}} 2 \sqrt[4k-2]{{a_{t}^{2}}} dt\right)^{4k-2}}{\pi^{4k-2}n^{4k-2}} + O(n^{-(4k-2)-\frac{1}{2}}) $$

It is immediate to see that \(\frac {\left ({{\int \limits }_{0}^{1}}2 \sqrt [4k-2]{{a_{t}^{2}}} dt\right )^{4k-2}}{(\pi n)^{4k-2}} = \frac {\left ({{\int \limits }_{0}^{1}} \sqrt [2k-1]{a_{t}} dt\right )^{4k-2}}{(\pi n/2)^{4k-2}}\). This mirrors the fact that the spectrum of \(Q_{2k-1}\) is symmetric and double, and any pair λ, −λ is sent to the same eigenvalue \(\lambda^2\). Thus, the (2k−1)-capacity of \(Q_{2k-1}\) is \(\left ({{\int \limits }_{0}^{1}} \sqrt [2k-1]{a_{t}} dt\right )^{2k-1}\).

Moreover, given two sequences \(\{a_{n}\}_{n \in \mathbb {N}}\) and \(\{b_{n}\}_{n \in \mathbb {N}}\) with \(b_n = o(a_n)\), we have \(\sqrt {{a_{n}^{2}}+{b_{n}^{2}}} = a_{n}\sqrt {1+\frac {{b_{n}^{2}}}{{a_{n}^{2}}}} = a_{n}\left (1+ O\left (\frac {{b_{n}^{2}}}{{a_{n}^{2}}}\right )\right )\); so taking square roots does not worsen the order of the error, and the remainder is still \((2k-1)+\frac {1}{2}\).

Arguing again by point (i) of Proposition 3 one gets the estimate in the statement.

The last part, about the \(\infty\)-capacity, follows just from Proposition 4: if \(A_j \equiv 0\) for all j, then for any ν > 0 we have \(\lambda_n n^{\nu} \to 0\) as \(n \to \pm \infty \). □

4 Proof of Theorem 2

Proof of Theorem 2

The proof of the first part of the statement follows from a couple of elementary considerations. In the sequel, we will use the short-hand notation \(\mathcal {A}\) for Skew(K).

  1. Fact 1:

    Equation (1) holds if and only if \(\mathcal {A}\) has finite rank

Suppose that \(K\vert _{\mathcal {V}}\) is symmetric. Consider the orthogonal splitting of \(L^2[0,1]\) as \(\mathcal {V} \oplus \mathcal {V}^{\perp }\). Equation (1) can be reformulated as \(\mathcal {A}(\mathcal {V})\subseteq \mathcal {V}^{\perp }\); thus \(\text {Im}(\mathcal {A}) = \mathcal {A}(L^{2}[0,1])\subseteq \mathcal {V}^{\perp } + \mathcal {A}(\mathcal {V}^{\perp } ) \), which is finite dimensional.

Conversely, if the range of \(\mathcal {A}\) is finite dimensional, we can decompose L2[0,1] as \(\text {Im}(\mathcal {A})\oplus \ker (\mathcal {A})\), where the decomposition is orthogonal by skew-symmetry. Thus, on \(\ker (\mathcal {A})\), K is symmetric.

  2. Fact 2:

    \(\mathcal {A}\) determines the kernel of K

It is well known that, if K is Hilbert-Schmidt, then \(K^{*}\) is Hilbert-Schmidt too. Since we are assuming (2), it is given by:

$$ K^{*}(v)(t) = {{\int}_{t}^{1}} V^{*}(\tau,t) v(\tau) d\tau. $$

So we can write down the integral kernel A(t,τ) of \(\mathcal {A}\) as follows:

$$ A(t,\tau) = \left\{\begin{array}{l} \frac{1}{2} V(t,\tau) \text{ if } \tau <t \\ -\frac{1}{2} V^{*}(\tau,t) \text{ if } t <\tau. \end{array}\right. $$

The key observation now is that the support of the kernel of K is disjoint from the support of the kernel of \(K^{*}\). Thus, the kernel of \(\mathcal {A}\) determines the kernel of K (and vice versa).

Now, since we are assuming that \(\mathcal {A}\) has finite dimensional image, we can present its kernel as:

$$ A(t,\tau) = \frac{1}{2} Z_{t}^{*} \mathcal{A}_{0} Z_{\tau} , $$

where \(\mathcal {A}_{0}\) is a skew-symmetric matrix and \(Z_t\) is a \( \dim (\text {Im}(\mathcal {A})) \times k\) matrix whose rows are the elements of some orthonormal basis of \(\text {Im}(\mathcal {A})\). Without loss of generality we can assume \(\mathcal {A}_{0} = J\): in fact, with an orthogonal change of coordinates, \(\mathcal {A}_{0}\) decomposes as a direct sum of rotations with amplitudes \(\lambda_i\), and rescaling the coordinates by \(\sqrt {\lambda _{i}}\) yields the desired canonical form J.
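
The finite-rank claim is easy to visualize numerically. In the sketch below (ours; the choice of \(Z_t\), the grid and the tolerances are arbitrary) the first 2m singular values of the discretized skew-symmetric part stay of order one, while the remaining ones are discretization noise of size O(1/N):

```python
import numpy as np

N, k = 300, 3                             # grid size, fibre dimension
h = 1.0 / N
t = (np.arange(N) + 0.5) / N
J0 = np.array([[0.0, -1.0], [1.0, 0.0]])  # A_0 = J with m = 1
Z = np.stack([np.vstack([np.cos(tau) * np.ones(k),
                         np.sin(tau) * np.arange(1, k + 1)]) for tau in t])

# Discretization of K(v)(t) = int_0^t Z_t^* A_0 Z_tau v(tau) dtau, as in (10).
K = np.zeros((N * k, N * k))
for i in range(N):
    for j in range(i):
        K[i*k:(i+1)*k, j*k:(j+1)*k] = Z[i].T @ J0 @ Z[j] * h

s = np.linalg.svd(0.5 * (K - K.T), compute_uv=False)
print(s[:4])  # two O(1) values, then a sharp drop: rank 2m = 2 in the limit
```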

The first part of the statement is proved, so we pass to the second one. First of all, notice that, now that we have written any operator satisfying (1) and (2) in the same form as those in (3), we can apply all the results about the asymptotics of their eigenvalues. In particular, if we assume that the space \( \text {Im}(\mathcal {A}) \subset L^{2}([0,1],\mathbb R^{k})\) is generated by piecewise analytic functions, the ordered sequence of eigenvalues satisfies:

$$\lambda_{n} = \frac{\xi}{\pi n} + O(n^{-5/3}), \quad \text{ as } n \to \pm \infty.$$

Notice that we are using a better estimate on the remainder (for the case of the 1-capacity) than the one given in Theorem 1; this sharper estimate was proved in [3]. We denote by \(M^{\dagger } = \bar M^{*}\) the conjugate transpose. Set \(2m = \dim (\text {Im}(\mathcal {A}))\); since the map \(t \mapsto Z_{t}\) is analytic, there exists a piecewise analytic family of unitary matrices Gt such that:

$$ G_{t}^{\dagger} Z_{t}^{*}JZ_{t}G_{t} = \left[\begin{array}{cccccccc} &i\zeta_{1}(t)\\ &&\ddots\\ &&&i\zeta_{l}(t)\\ &&&&-i\zeta_{1}(t)\\ &&&&&\ddots\\ &&&&&&-i\zeta_{l}(t)\\ &&&&&&&\underline{0} \end{array}\right] $$

Without loss of generality we can assume that the functions ζi are analytic on the whole interval and everywhere non-negative. Recall that the coefficient ξ appearing in the asymptotic was computed as \(\xi = {{\int }_{0}^{1}} \zeta (t)dt = {{\int }_{0}^{1}} {\sum }_{i=1}^{l} \zeta _{i}(t) dt\).

Let us work on the Hilbert space \(L^{2}([0,1],\mathbb {C}^{k})\) with the standard Hermitian product. Notice that \(G: L^{2}([0,1],\mathbb {C}^{k} )\to L^{2}([0,1],\mathbb {C}^{k})\), \(v\mapsto G_{t}v\), is an isometry; thus the eigenvalues of \(Skew(K)=\mathcal {A}\) remain the same if we consider the similar operator \(G^{-1} \circ \mathcal {A} \circ G\), which acts as follows:

$$ G^{-1} \circ \mathcal{A} \circ G (v) = \frac{1}{2} {G}_{t}^{\dagger} Z_{t}^{*}J{{\int}_{0}^{1}} Z_{\tau}G_{\tau}v(\tau) d\tau $$

To simplify notation let us forget about this change of coordinates and still call Zt the matrix ZtGt. Write Zt as:

$$Z_{t}= \left( \begin{array}{c} y^{*}_{1}(t)\\ \vdots\\ y_{m}^{*}(t)\\ x_{1}^{*}(t)\\ \vdots\\ x^{*}_{m}(t) \end{array}\right).$$

We introduce the following notation: for a vector function vi, the quantity (vi)j stands for the j-th component of vi.

We can now bound the function ζ(t) in terms of the components of the matrix Zt:

$$ \begin{array}{@{}rcl@{}} 2 \zeta (t) &=& {\sum}_{j=1}^{k} \vert(Z_{t}^{\dagger} JZ_{t})_{jj}\vert \le {\sum}_{i=1}^{m}{\sum}_{j=1}^{k} \vert(x_{i})_{j} (\bar{y}_{i})_{j}-(y_{i})_{j}(\bar{x}_{i})_{j}\vert(t)\\ &=&{\sum}_{i=1}^{m}{\sum}_{j=1}^{k} 2\vert\text{Im}((x_{i})_{j}(\bar{y}_{i})_{j})\vert \le {\sum}_{i=1}^{m}{\sum}_{j=1}^{k} 2\vert(x_{i})_{j}\vert\vert(y_{i})_{j}\vert = {\sum}_{i=1}^{m} 2\langle \vert x_{i}\vert,\vert y_{i}\vert \rangle(t) \end{array} $$

where |v| denotes the vector whose entries are the absolute values of the entries of v. Integrating and using the Hölder inequality for the L2 norm, we get:

$$ \xi = {{\int}_{0}^{1}} \zeta(t) dt \le {\sum}_{i=1}^{m} \vert \vert x_{i}\vert \vert_{2} \vert\vert y_{i} \vert\vert_{2}. $$

The next step is to relate the quantity on the right-hand side to the eigenvalues of \(\mathcal {A}\). The strategy is to modify the matrix Zt in order to get an orthonormal frame of \(\text{Im}(\mathcal {A})\). Keeping track of the transformations used, we get a matrix representing \(\mathcal {A}\); then it is enough to compute the eigenvalues of said matrix.

We can assume, without loss of generality, that \(\langle x_{i},x_{j} \rangle _{L^{2}} =\delta _{ij}\). This can be achieved with a symplectic change of the matrix Zt. Then we modify the yj in order to make them orthogonal to the space generated by the xj, using the following transformation:

$$ \left( \begin{array}{c} Y_{t} \\X_{t} \end{array}\right) \mapsto \left( \begin{array}{cc} 1 & M \\ 0& 1 \end{array}\right) \left( \begin{array}{c} Y_{t} \\X_{t} \end{array}\right) = \left( \begin{array}{c} Y_{t}+MX_{t} \\X_{t} \end{array}\right) $$

where M is defined by the relation \({{\int }_{0}^{1}} Y_{t}X_{t}^{*} +MX_{t}X_{t}^{*} dt ={{\int }_{0}^{1}} Y_{t}X_{t}^{*}dt +M=0\). The last step is to make the yj orthonormal. If we multiply Yt by a matrix L, we find the equation \(L{{\int }_{0}^{1}} Y_{t}Y_{t}^{*} dtL^{*} = 1\), so \(L = ({{\int }_{0}^{1}}Y_{t}Y_{t}^{*}dt)^{-\frac {1}{2}}\). Thus, the matrix representing \(\mathcal {A}\) in these coordinates is one half of:

$$ \mathcal{A}_{0} = \left( \begin{array}{cc} L^{-1} &0\\ -M^{*} &1 \end{array}\right) \left( \begin{array}{cc} 0 &-1\\ 1 &0 \end{array}\right) \left( \begin{array}{cc} L^{-1} &-M \\ 0 &1 \end{array}\right) = \left( \begin{array}{cc} 0 &-L^{-1}\\ L^{-1}& M^{*}-M \end{array}\right) $$
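The three steps above are effectively an algorithm; the following sketch (random samples standing in for \(X_t\) and \(Y_t\); the grid average replaces \({\int }_{0}^{1} \cdot \, dt\) and is our own discretization choice) carries them out and assembles \(\mathcal {A}_{0}\):

```python
import numpy as np
from scipy.linalg import sqrtm

m, k, N = 2, 3, 4000
rng = np.random.default_rng(1)
Ys = rng.standard_normal((N, m, k))             # samples of Y_t
Xs = rng.standard_normal((N, m, k))             # samples of X_t

def integral(As, Bs):                           # \int_0^1 A_t B_t^* dt on the grid
    return np.einsum('nij,nkj->ik', As, Bs) / N

# step 0: <x_i, x_j> = delta_ij via the symplectic change diag(W^{-1}, W)
W = np.linalg.inv(sqrtm(integral(Xs, Xs)).real)
Xs = np.einsum('ij,njk->nik', W, Xs)
Ys = np.einsum('ij,njk->nik', np.linalg.inv(W), Ys)

# step 1: M makes the new y_j orthogonal to the x_j
M = -integral(Ys, Xs)
Ys = Ys + np.einsum('ij,njk->nik', M, Xs)

# step 2: L = (\int Y Y^* dt)^{-1/2} makes the y_j orthonormal
L = np.linalg.inv(sqrtm(integral(Ys, Ys)).real)
Ys = np.einsum('ij,njk->nik', L, Ys)

print(np.round(integral(Xs, Xs), 3), np.round(integral(Ys, Xs), 3))  # ~ Id, ~ 0
A0 = np.block([[np.zeros((m, m)), -np.linalg.inv(L)],
               [np.linalg.inv(L), M.T - M]])    # the matrix of 2 Skew(K)
```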

If we square \(\mathcal {A}_{0}\) and compute the trace, we get:

$$ -\frac{1}{2}\text{tr}\left( {\mathcal{A}_{0}^{2}}\right) = \text{tr}(L^{-2})-\frac{1}{2} \text{tr} ((M^{*}-M)^{2})\ge \text{tr}\left( {{\int}_{0}^{1}}Y_{t}Y_{t}^{*} dt\right) = {\sum}_{i=1}^{m} \vert \vert y_{i} \vert {\vert_{2}^{2}} $$

Call \({\Sigma }(\mathcal {A})\) the spectrum of \(\mathcal {A}\); since \(\mathcal {A}\) is skew-symmetric, it follows that:

$$ -\frac{1}{2}\text{tr}({\mathcal{A}_{0}^{2}}) = 4 {\sum}_{\mu \in {\Sigma}(\mathcal{A}), -i\mu>0 } -\mu^{2} \ge0. $$

Recalling that ||xi|| = 1 and putting everything together, we find that:

$$ \xi \le {\sum}_{i=1}^{m} \vert \vert y_{i} \vert \vert_{2} \le \sqrt{m} \sqrt{{\sum}_{i=1}^{m} \vert \vert y_{i} \vert {\vert_{2}^{2}}} \le 2\sqrt{m} \sqrt{{\sum}_{\mu \in {\Sigma}(\mathcal{A}), -i\mu>0 } -\mu^{2}}. $$
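A randomized spot-check of this bound is straightforward. From the finite-rank representation above, one can check that the nonzero eigenvalues of \(\mathcal {A}\) coincide with those of \(\frac {1}{2}JG\), where \(G = {\int }_{0}^{1} Z_{t}Z_{t}^{*}\, dt\) is the Gram matrix of the frame; the sketch below (a random trigonometric \(Z_{t}\) of our own choosing) compares the two sides of the inequality:

```python
import numpy as np

m, k, N = 1, 2, 4000
t = (np.arange(N) + 0.5) / N
J = np.array([[0.0, -1.0], [1.0, 0.0]])
rng = np.random.default_rng(2)
C = rng.standard_normal((3, 2 * m, k))
Zt = (C[0][None] + np.sin(2 * np.pi * t)[:, None, None] * C[1]
      + np.cos(2 * np.pi * t)[:, None, None] * C[2])       # analytic Z_t, shape N x 2m x k

# capacity xi = \int sum_i zeta_i(t) dt, with +/- i zeta_i the spectrum of Z_t^* J Z_t
xi = sum(np.abs(np.linalg.eigvals(Z.T @ J @ Z).imag).sum() for Z in Zt) / (2 * N)

G = np.einsum('nij,nkj->ik', Zt, Zt) / N                   # Gram matrix \int Z_t Z_t^* dt
mu = np.linalg.eigvals(0.5 * J @ G).imag                   # spectrum of Skew(K)
rhs = 2 * np.sqrt(m) * np.sqrt((mu[mu > 0] ** 2).sum())
print(xi, rhs, xi <= rhs)
```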

Example 1

Consider a matrix Zt of the following form:

$$ Z_{t} = \left[\begin{array}{cc} \xi_{1}(t) &\xi_{3}(t) \\0 &\xi_{2}(t) \end{array}\right] \quad Z_{t}^{*}JZ_{t} = \left[ \begin{array}{cc} 0 & -{\xi}_{1}\xi_{2}(t) \\ {\xi}_{2}\xi_{1}(t) & 0 \end{array}\right] $$

The capacity of K is given by \(\zeta = {{\int }_{0}^{1}} \vert \xi _{1} \xi _{2} \vert (t) dt\). We can assume that 〈ξ2,ξ3〉 = 0 and ||ξ2|| = 1. A direct computation shows that the eigenvalues of Skew(K) are \(\pm\frac {i}{2} \sqrt {\vert \vert \xi _{1}\vert \vert ^{2}+\vert \vert \xi _{3}\vert \vert ^{2}}\). This shows that the two quantities can behave in very different ways. If we choose ξ2 very close to ξ1 and ξ3 small, the capacity and the eigenvalues are comparable. If we choose ξ3 very big, the capacity remains the same whereas the eigenvalues explode. In particular, there cannot be any lower bound on ζ in terms of the eigenvalues of Skew(K).
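The same \(\frac {1}{2}JG\) identity makes the example easy to check numerically; the functions below are one hypothetical choice satisfying 〈ξ2,ξ3〉 = 0 and ||ξ2|| = 1:

```python
import numpy as np

# xi_1 = 1 + t, xi_2 = 1, xi_3 = c (t - 1/2): the capacity ignores xi_3,
# the eigenvalues of Skew(K) do not.
N = 100000
t = (np.arange(N) + 0.5) / N
J = np.array([[0.0, -1.0], [1.0, 0.0]])
xi1, xi2 = 1 + t, np.ones(N)

for c in (0.1, 10.0):
    xi3 = c * (t - 0.5)
    capacity = np.mean(np.abs(xi1 * xi2))                  # \int_0^1 |xi_1 xi_2| dt
    G = np.array([[np.mean(xi1**2 + xi3**2), np.mean(xi3 * xi2)],
                  [np.mean(xi2 * xi3),       np.mean(xi2**2)]])
    top = np.linalg.eigvals(0.5 * J @ G).imag.max()        # modulus of the eigenvalues
    print(c, capacity, top, 0.5 * np.sqrt(np.mean(xi1**2) + np.mean(xi3**2)))
```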

Remark 4

There is a natural class of transformations that preserve the capacity. Take any path Φt of symplectic matrices (say L2 integrable); the operators constructed with Zt and ΦtZt have the same capacity (but the respective skew-symmetric parts clearly do not have the same eigenvalues).

Set \(K^{\Phi }(v) = {{\int }_{0}^{t}}Z_{t}^{*}J {\Phi }_{t}^{-1}{\Phi }_{\tau } Z_{\tau } v_{\tau } d \tau \) and denote by Σ+(KΦ) the set of eigenvalues σ of Skew(KΦ) satisfying − iσ ≥ 0. It seems natural to ask whether:

$$ \zeta(K) = 2\inf_{{\Phi}_{t} \in Sp(n)} \sqrt{{\sum}_{\sigma \in {\Sigma}^{+}(K^{\Phi})}-\sigma^{2}} $$

Take for instance the example above and suppose for simplicity that ξ1 and ξ2 are positive and never vanishing. Using the following transformation we obtain:

$$ Z^{\prime}_{t} =\left[ \begin{array}{cc} \sqrt{\frac{\xi_{2}}{\xi_{1}}} &\frac{-\xi_{3}}{\sqrt{\xi_{1}\xi_{2}}} \\ 0 & \sqrt{\frac{\xi_{1}}{\xi_{2}}} \end{array}\right] \left[\begin{array}{cc} \xi_{1} & \xi_{3} \\ 0 &\xi_{2} \end{array}\right] = \left[\begin{array}{cc} \sqrt{\xi_{1}\xi_{2}} & 0 \\ 0 &\sqrt{\xi_{1}\xi_{2}} \end{array}\right] $$

and in this case the eigenvalues become \(\pm\frac {i}{2}\langle \xi _{1},\xi _{2}\rangle \), precisely half the capacity.

5 The Second Variation of an Optimal Control Problem

We start this section by collecting some basic facts about optimal control problems and the first and second variation. Standard references on the topic are [3, 4, 7, 10] and [8].

5.1 Symplectic Geometry and Optimal Control Problems

Consider a smooth manifold M; its cotangent bundle \(T^{*}M\) is a vector bundle on M whose fibre at a point q is the vector space of linear functions on \(T_{q}M\), the tangent space of M at q.

Let π be the natural projection \(\pi : T^{*}M \to M\), which takes a covector and gives back its base point:

$$ \pi :T^{*}M \to M, \quad \pi(\lambda_{q}) = q. $$

Using the projection map we define the following 1-form, called the tautological (or Liouville) form: for an element \(X \in T_{\lambda }(T^{*}M)\), set \(s_{\lambda }(X) = \lambda (\pi _{*}X)\). One can check in local coordinates that σ = ds is non-degenerate. We thus obtain a symplectic manifold \((T^{*}M,\sigma )\).

Using the symplectic form we can associate a vector field to any function on \(T^{*}M\). Suppose that H is a smooth function on \(T^{*}M\); we define \(\vec H\) by setting:

$$ \sigma(X,\vec H_{\lambda} ) = d_{\lambda} H (X), \quad \forall X \in T_{\lambda}(T^{*}M) $$

H is called the Hamiltonian function and \(\vec {H}\) the associated Hamiltonian vector field.

On \(T^{*}M\) we have a particular instance of this construction, which can be used to lift arbitrary flows on the base manifold M to Hamiltonian flows on \(T^{*}M\). For any vector field V on M, consider the following function:

$$ h_{V}(\lambda) = \langle \lambda, V \rangle, \quad \lambda \in T^{*}M. $$

It is straightforward to check in local coordinates that \(\pi _{*} \vec {h}_{V} = V\).

The next objects we are going to introduce are Lagrangian subspaces. We say that a subspace W of a symplectic vector space (Σ,σ) is Lagrangian if it coincides with its symplectic orthogonal, i.e. if {v ∈Σ : σ(v,w) = 0,∀w ∈ W} = W. An example of a Lagrangian subspace is the tangent space to a fibre, i.e. the kernel of π∗. More generally, we can consider the following submanifolds of \(T^{*}M\):

$$ A(N)=\{\lambda \in T^{*}M : \lambda (X) =0, \forall X \in TN, \pi(\lambda)\in N\} $$

where NM is a submanifold. A(N) is called the annihilator of N and its tangent space at any point is a Lagrangian subspace.

Suppose we are given a family of complete and smooth vector fields fu depending on a parameter \(u \in U \subset \mathbb {R}^{k}\), and a Lagrangian, i.e. a smooth function φ(u,q) on U × M. We use the vector fields fu to produce a family of curves on M. For any function \(u\in L^{\infty }([0,1],U)\), we consider the following non-autonomous ODE system on M:

$$ \dot{q} = f_{u(t)}(q), \quad q(0) = q_{0} \in M $$
(21)

The solutions are always Lipschitz curves. For fixed q0, the set of functions \(u \in L^{\infty }([0,1],U)\) for which the corresponding curve is defined up to time 1 is an open set, which we call \(\mathcal U_{q_{0}}\). We can let the base point q0 vary and consider \(\mathcal U = \cup _{q_{0}\in M} \mathcal {U}_{q_{0}}\). It turns out that this set has the structure of a Banach manifold (see [6]). We call the \(L^{\infty }\) functions obtained this way admissible controls and the corresponding trajectories on M admissible curves.

Denote by γu the admissible curve obtained from an admissible control u. We are interested in the following minimization problem on the space of admissible controls:

$$ \min_{u \text{ admissible}} \mathcal{J}(u) = \min_{u \text{ admissible}} {{\int}_{0}^{1}} \varphi(u(t),\gamma_{u}(t))dt $$
(22)

We often reduce the space of admissible variations by imposing additional constraints on the initial and final position of the trajectory. For example, one can consider trajectories that start and end at two fixed points q0,q1 ∈ M, or trajectories that start from a submanifold N0 and reach a second submanifold N1. More generally, we can ask that the curves satisfy \((\gamma (0), \gamma (1))\in N\subseteq M\times M\).
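For concreteness, a toy instance of (21) and (22) can be simulated directly; the system \(f_{u}(q) = (\cos u, \sin u)\) on \(M = \mathbb {R}^{2}\) and the Lagrangian \(\varphi (u,q) = |q|^{2}\) below are hypothetical choices, used only to illustrate the objects involved:

```python
import numpy as np
from scipy.integrate import solve_ivp

u = lambda t: np.sin(2 * np.pi * t)                 # an admissible control in L^infty

def f(t, q):                                        # \dot q = f_{u(t)}(q), q(0) = q_0
    return [np.cos(u(t)), np.sin(u(t))]

sol = solve_ivp(f, (0.0, 1.0), [0.0, 0.0], dense_output=True, rtol=1e-8)
ts = np.linspace(0.0, 1.0, 1001)
qs = sol.sol(ts)                                    # the admissible curve gamma_u
J_u = (qs ** 2).sum(axis=0).mean()                  # J(u) = \int_0^1 phi(u, gamma_u) dt
print(qs[:, -1], J_u)                               # endpoint gamma_u(1) and cost
```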

We often consider the following family of functions on TM:

$$ h_{u}: T^{*}M \to \mathbb R, \quad h_{u}(\lambda) = \langle \lambda,f_{u}\rangle +\nu \varphi(u,\pi(\lambda)). $$

We use them to lift vector fields on M to vector fields on \(T^{*}M\). They are closely related to the functions hV defined above and still satisfy \(\pi _{*}(\vec {h}_{u}) = f_{u}\).

In particular, if \(\tilde \gamma \) is an admissible curve, we can build a lift, i.e. a curve \(\tilde \lambda \) in \(T^{*}M\) such that \(\pi (\tilde \lambda ) = \tilde \gamma \), by solving \(\dot \lambda = \vec h_{u}(\lambda )\). The following theorem, known as the Pontryagin Maximum Principle, gives a characterization of the critical points of \(\mathcal {J}\) for any set of boundary conditions.

Theorem (PMP)

If a control \(\tilde u\in L^{\infty }([0,1],U)\) is a critical point for the functional in (22), then there exist a curve \(\lambda : [0,1] \to T^{*}M\) and an admissible curve q : [0,1] → M such that for almost all t ∈ [0,1]:

  1. λ(t) is a lift of q(t):

    $$ q(t) = \pi (\lambda(t)); $$
  2. λ(t) satisfies the following Hamiltonian system:

    $$ \frac{d \lambda}{dt} = \vec{h}_{\tilde u(t)}(\lambda); $$
  3. the control \(\tilde u\) is determined by the maximum condition:

    $$ h_{\tilde u(t)}(\lambda(t)) = \max_{u\in U} h_{u}(\lambda(t)), \quad \nu\le0; $$
  4. the non-triviality condition holds: (λ(t),ν)≠(0,0);

  5. the transversality condition holds:

    $$(-\lambda(0),\lambda(1)) \in A(N).$$

We call q(t) an extremal curve (or trajectory) and λ(t) an extremal.

There are essentially two possibilities for the parameter ν: it can be either 0 or, after an appropriate normalization of λt, − 1. The extremals belonging to the first family are called abnormal, whereas the ones belonging to the second are called normal.

5.2 The Endpoint Map and its Differentiation

We will consider now in detail the minimization problem in equation (22) with fixed endpoints.

As in the previous section, we denote by \(\mathcal {U}_{q_{0}}\subset L^{\infty }([0,1],U)\) the space of admissible controls at the point q0 and define the following map:

$$ E^{t}: \mathcal U_{q_{0}} \to M,\quad u\mapsto \gamma_{u}(t) $$

It takes the control u and gives the position at time t of the solution of (21) starting from q0. We call this map the endpoint map. It turns out that Et is smooth; we are now going to compute its differential and Hessian. The proofs of these facts can be found in the book [7] or in [1].

For a fixed control \(\tilde u \), consider the function \(h_{\tilde {u}}(\lambda ) = h_{\tilde {u}(t)}(\lambda )\) and define the following non-autonomous flow, which plays the role of parallel transport in this context:

$$ \frac{d}{dt} \tilde{\Phi}_{t} = \vec{h}_{\tilde{u}}(\tilde{\Phi}_{t}) \qquad \tilde{\Phi}_{0} = Id $$
(23)

It has the following properties:

  • It extends to the cotangent bundle the flow which solves \(\dot {q} = f^{t}_{\tilde {u}}(q)\) on the base. In particular, if λt is an extremal with initial condition λ0, \(\pi (\tilde {\Phi }_{t}(\lambda _{0})) = q_{\tilde {u}}(t)\) where \(q_{\tilde {u}}\) is an extremal trajectory.

  • \(\tilde {\Phi }_{t}\) maps fibres to fibres: for each \(q \in M\), the restriction \(\tilde {\Phi }_{t}: T^{*}_{q}M \to T^{*}_{\tilde {\Phi }_{t}(q)}M \) is an affine transformation (here \(\tilde {\Phi }_{t}(q)\) denotes the image of q under the induced flow on the base).

We suppose now that λ(t) is an extremal and \(\tilde {u}\) a critical point of the functional \(\mathcal {J}\). We use the symplectomorphism \(\tilde {\Phi }_{t}\) to pull back the whole curve λ(t) to the starting point λ0. We can express all the first and second order information about the extremal using the following map and its derivatives:

$$ {b_{u}^{t}}(\lambda) = ({h_{u}^{t}}-h_{\tilde{u}}^{t})\circ \tilde{\Phi}_{t}(\lambda) $$

Notice that:

  • \({b_{u}^{t}}(\lambda _{0})\vert _{u =\tilde {u}(t)} =0 = d_{\lambda _{0}} {b_{u}^{t}}\vert _{u =\tilde {u}(t)}\) by definition.

  • \(\partial _{u} {b_{u}^{t}}\vert _{u =\tilde {u}(t)} = \partial _{u} \left ({h_{u}^{t}}\circ \tilde {\Phi }_{t}\right )\vert _{u =\tilde {u}(t)} =0\) since λ(t) is an extremal and \(\tilde {u}\) the corresponding control.

Thus, the first non-zero derivatives are those of order two. We define the following maps:

$$ \begin{array}{r} Z_{t} = \partial_{u} \vec{b}_{u}^{t}(\lambda_{0})\vert_{u=\tilde{u}(t)} : \mathbb{R}^{k} = T_{\tilde{u}(t)}U \to T_{\lambda_{0}}(T^{*}M) \\ H_{t} = {\partial_{u}^{2}} b_{t}(\lambda_{0})\vert_{u=\tilde{u}(t)} : \mathbb{R}^{k} =T_{\tilde{u}(t)}U \to T^{*}_{\tilde{u}(t)}U =\mathbb{R}^{k} \end{array} $$
(24)

We denote by \({\Pi }=\ker \pi _{*}\) the kernel of the differential of the natural projection π : TMM.

Proposition 5 (Differential of the endpoint map)

Consider the endpoint map \(E^{t}: \mathcal {U}_{q_{0}} \to M\). Fix a point \(\tilde {u}\) and consider the symplectomorphism \(\tilde {\Phi }_{t}\) and the map Zt defined above. The differential is the following map:

$$ d_{\tilde{u}} E (v_{t}) =d_{\lambda(t)} \pi \circ d_{\lambda_{0}}\tilde{\Phi}_{t}\left( {{\int}_{0}^{t}} Z_{\tau} v_{\tau} d\tau\right) \in T_{q_{t}}M $$

In particular, if we identify \(T_{\lambda _{0}}(T^{*} M)\) with \(\mathbb {R}^{2n}\) and write \(Z_{t} = \left (\begin {array}{c} Y_{t} \\ X_{t} \end {array}\right )\), then \(\tilde {u}\) is a regular point if and only if \(v_{t}\mapsto {{\int }_{0}^{t}} X_{\tau } v_{\tau } d\tau \) is surjective, or equivalently if the following matrix is invertible:

$$ {\Gamma}_{t} = {{\int}_{0}^{t}} X_{\tau} X^{*}_{\tau} d\tau \in Mat_{n\times n}(\mathbb{R}), \quad \det({\Gamma}_{t})\ne 0 $$

If \(d_{\tilde {u}} E^{t}\) is surjective, then (Et)− 1(qt) is smooth in a neighbourhood of \(\tilde {u}\) and its tangent space is given by:

$$ \begin{array}{@{}rcl@{}} T_{\tilde{u}}(E^{t})^{-1}(q_{t}) &=& \{v \in L^{\infty}([0,1],\mathbb{R}^{k}) : {{\int}_{0}^{t}} X_{\tau} v_{\tau} d \tau =0\} \\ &=& \{v \in L^{\infty}([0,1],\mathbb{R}^{k}) : {{\int}_{0}^{t}} Z_{\tau} v_{\tau} d \tau \in {\Pi}\} \end{array} $$
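The invertibility test for \({\Gamma }_{t}\) is easy to carry out numerically. For the toy choice \(X_{t} = (1, t)^{*}\) (so k = 1, n = 2, an example of ours), one gets \({\Gamma }_{1} = \left (\begin {array}{cc} 1 & 1/2 \\ 1/2 & 1/3\end {array}\right )\), which is invertible:

```python
import numpy as np

N = 100000
t = (np.arange(N) + 0.5) / N
X = np.stack([np.ones(N), t])                       # n x N samples of X_t = (1, t)^T
Gamma = (X @ X.T) / N                               # Gramian \int_0^1 X_t X_t^* dt
print(Gamma, np.linalg.det(Gamma))                  # det ~ 1/12 != 0: dE is surjective
```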

When the differential of the endpoint map is surjective, a good geometric description of the situation is possible. The set of admissible controls reaching q1 becomes smooth (at least locally) and our minimization problem can be interpreted as a constrained optimization problem: we are looking for critical points of \(\mathcal {J}\) on the submanifold \(\{u \in \mathcal {U} : E^{t}(u) = q_{1}\}\).

Definition 2

We say that a normal extremal λ(t) with associated control \(\tilde {u}(t)\) is strictly normal if the differential of the endpoint map at \(\tilde {u}\) is surjective.

It makes sense to go on and consider higher order optimality conditions. At a critical point, the Hessian of \(\mathcal J\) (or second variation) is well defined, i.e. independent of coordinates. Using chronological calculus (see again [7] or [1]), it is possible to write the second variation of \(\mathcal {J}\) on \(\ker dE^{t} \subseteq L^{\infty }([0,1],\mathbb {R}^{k})\).

Proposition 6 (Second variation)

Suppose that \((\lambda (t),\tilde {u})\) is a strictly normal critical point of \(\mathcal {J}\) with fixed initial and final point. For any \(u \in L^{\infty }([0,1],\mathbb {R}^{k})\) such that \({{\int \limits }_{0}^{1}}X_{t}u_{t} dt =0\) the second variation of \(\mathcal J\) has the following expression:

$$ d^{2}_{\tilde{u}}\mathcal{J}(u) = -{{\int}_{0}^{1}} \langle H_{t}u_{t}, u_{t}\rangle dt - {{\int}_{0}^{1}}{{\int}_{0}^{t}} \sigma (Z_{\tau} u_{\tau} ,Z_{t} u_{t} ) d\tau dt $$

The associated bilinear form is symmetric provided that u,v lie in a subspace that projects to a Lagrangian one via the map \(u \mapsto {{\int }_{0}^{1}} Z_{t} u_{t} dt\):

$$ d^{2}_{\tilde{u}}\mathcal{J}(u,v) = -{{\int}_{0}^{1}}\langle H_{t}u_{t}, v_{t}\rangle dt - {{\int}_{0}^{1}}{{\int}_{0}^{t}} \sigma (Z_{\tau} u_{\tau} ,Z_{t} v_{t} ) d\tau dt $$

One often makes the assumption, customarily called the strong Legendre condition, that the matrix Ht is strictly negative definite with uniformly bounded inverse. This guarantees that the term:

$$ {{\int}_{0}^{1}}-\langle H_{t}u_{t},v_{t}\rangle dt $$

is equivalent to the L2 scalar product.

Definition 3

Suppose that the set \(U\subset \mathbb {R}^{k}\) is open; we say that \((\lambda (t),\tilde {u})\) is a regular critical point if the strong Legendre condition holds along the extremal. If Ht ≤ 0 but \((\lambda (t),\tilde {u})\) does not satisfy the strong Legendre condition, we say that \((\lambda (t),\tilde {u})\) is singular. If Ht ≡ 0, we say that it is totally singular.

Even if the extremal \((\lambda (t),\tilde {u})\) is abnormal or not strictly normal it is possible to produce a second variation for the optimal control problem. To do so, one considers the extended control system:

$$ \hat{f}_{(v,u)}(q) = \left( \begin{array}{c} \varphi(u,q)+v \\ f_{u}(q) \end{array}\right) \in \mathbb{R}\times T_{q}M $$

and the corresponding endpoint map \(\hat E^{t}: (0,+\infty ) \times \mathcal U_{q_{0}} \to \mathbb {R}\times M\). To differentiate it, we use the same construction explained above and employ the following Hamiltonians on \(\mathbb {R}^{*}\times T^{*}M\):

$$ \hat{h}_{(v,u)} (\nu,\lambda)= \langle \lambda,f_{u}\rangle +\nu(\varphi(u,q)+v) $$

One has just to identify the right controls to consider: PMP implies that \(\dot \nu =0\), ν ≤ 0 and v = 0. In the end, one obtains a formally identical expression involving the derivatives of the functions \(\hat h_{(v,u)}\), and recovers the expression of Proposition 6 for strictly normal extremals (see [7, Chapter 20] or [8]).

5.3 Reformulation of the Main Results

In this section, we reformulate Theorem 2 as a characterization of the compact part of the second variation of an optimal control problem at a strictly normal regular extremal (see Definitions 2 and 3).

Theorem 3

Suppose \(\mathcal {V}\subset L^{2}([0,1],\mathbb {R}^{k})\) is a finite codimension subspace and K an operator satisfying (1) and (2). Then \((K,\mathcal {V})\) can be realized as the second variation of an optimal control problem at a strictly normal regular extremal. To any such couple, we can associate a triple ((Σ,σ),π,Z) consisting of:

  • a finite dimensional symplectic space (Σ,σ);

  • a Lagrangian subspace π ⊂Σ;

  • a linear map \(Z: L^{2}([0,1],\mathbb {R}^{k}) \to {\Sigma }\) such that Im(Z) is transversal to the subspace π.

This triple is unique up to the action of stabπ(Σ,σ), the group of symplectic transformations that fix π. Any other triple is given by ((Σ,σ),π,Φ ∘ Z) for Φ ∈stabπ(Σ,σ).

Vice versa, any triple ((Σ,σ),π,Z) as above determines a couple \((K,\mathcal {V})\): we can define the skew-symmetric part \(\mathcal {A}\) of K as:

$$ \langle \mathcal{A} u, v \rangle = \sigma (Zu,Zv), \forall u,v \in L^{2}([0,1],\mathbb{R}^{k}), $$

\(\mathcal {A}\) determines the whole operator K, and its domain is recovered as \( \mathcal {V} = Z^{-1}(\pi )\).

Proof

The proof is essentially a reformulation of Theorem 2. Given the operator we construct the symplectic space (Σ,σ) taking as vector space the image of the skew-symmetric part \(\text {Im}(\mathcal {A})\) and as symplectic form \(\langle \mathcal {A} \cdot , \cdot \rangle \).

The transversality condition corresponds to the fact that the differential of the endpoint map is surjective.

The only thing left to show is the uniqueness of the triple. Without loss of generality, we can assume that the symplectic space \(({\Sigma }, \sigma ) = (\mathbb {R}^{2n},\sigma )\) is the standard one and that the Lagrangian subspace π is the vertical subspace. In these coordinates,

$$Z(v) = {{\int}_{0}^{1}}Z_{t} v_{t} dt = {{\int}_{0}^{1}} \left( \begin{array}{c} Y_{t} \\ X_{t} \end{array}\right) v_{t} dt.$$

Define the following map:

$$F: L^{2}([0,1],\text{Mat}_{n\times k}(\mathbb{R})) \to L^{2}([0,1]^{2},\text{Mat}_{k\times k}(\mathbb{R})), \quad Y_{t} \mapsto Z_{t}^{*}JZ_{\tau} = X_{t}^{*}Y_{\tau}-Y_{t}^{*}X_{\tau}.$$

It is linear if Xt is fixed. To determine uniqueness we have to study an affine equation, so it is sufficient to study the kernel of F. Suppose for simplicity that Xt and Yt are continuous in t. We have to solve the equation:

$$ F(Y_{t}) = Z_{t}^{*}JZ_{\tau} =\sigma (Z_{t} ,Z_{\tau}) = 0. $$

Consider the following subspace of \(\mathbb {R}^{2n}\)

$$ V^{[0,1]} = \left\{{\sum}_{i=1}^{l} Z_{t_{i}}\nu_{i} : \nu_{i} \in \mathbb{R}^{k}, t_{i} \in[0,1], l \in \mathbb{N}\right\} \subset \mathbb{R}^{2n} $$

It follows that F(Yt) = 0 if and only if the subspace V[0,1] is isotropic. Since we are in finite dimension, we can choose a finite number of instants ti which suffice to generate the whole V[0,1]. Call I the set of these instants. Without loss of generality, we can assume that \(\left \{{\sum }_{i \in I}X_{t_{i}}\nu _{i} : \nu _{i} \in \mathbb {R}^{k}\right \}=\mathbb {R}^{n}\).

This is so since the image of Z is transversal to π, and thus \({\Gamma } = {{\int }_{0}^{1}} X_{t}X_{t}^{*} dt \) is non-degenerate. In fact, if the subspace \(\left \{{\sum }_{i=1}^{l} X_{t_{i}}\nu _{i} : \nu _{i} \in \mathbb {R}^{k}, l \in \mathbb {N}\right \}\) were a proper subspace of \(\mathbb {R}^{n}\), there would be a vector μ such that 〈μ,Xtν〉 = 0 for all t ∈ [0,1] and all \(\nu \in \mathbb {R}^{k}\); μ would thus be an element of the kernel of Γ, a contradiction.

Now, we evaluate the equation \(F(Y_{t}) =0 \iff Y_{t}^{*} X_{\tau }=X_{t}^{*}Y_{\tau } \) at the instants t = ti that guarantee controllability. One can read off the following identities:

$$ Y_{t}^{*} v_{j} = X_{t}^{*} c_{j} $$

where the \(v_{j}\)'s form a basis of \(\mathbb {R}^{n}\) and the cj are free parameters. Taking transposes, we get that Yt = GXt for some matrix G.

It is straightforward to check that, if Yt = GXt, then G must be symmetric; in fact:

$$ Z_{t}^{*}JZ_{\tau} =Y_{t}^{*}X_{\tau}-X_{t}^{*} Y_{\tau}=X_{t}^{*}(G^{*}-G)X_{\tau} = 0 \iff G = G^{*} $$

This proves uniqueness when Xt and Yt are continuous.

The case in which Xt and Yt are just L2 (matrix-valued) functions can be dealt with similarly: one has just to replace evaluations with integrals of the form \({\int }_{t-\epsilon }^{t+\epsilon }Z_{\tau } \nu d\tau \) and \({\int }_{t-\epsilon }^{t+\epsilon } X_{\tau } \nu d\tau \) and interpret every equality for almost every t.

The only thing left to show is how to construct a control system having a given \((K,\mathcal {V})\) as second variation. By the equivalence stated above, it is enough to show that we can realize any given map \(Z : L^{2}([0,1],\mathbb {R}^{k})\to {\Sigma }\) with a suitable control system. We can assume without loss of generality that (Σ,σ) is just \(\mathbb {R}^{2m}\) with the standard symplectic form and that π is the vertical subspace. With these choices, the map Z is given by:

$$ v \mapsto {{\int}_{0}^{1}} Z_{t} v_{t} dt = {{\int}_{0}^{1}} \left( \begin{array}{c} Y_{t} v_{t} \\X_{t} v_{t} \end{array}\right) dt $$

The operator K is then given by \(K(v) = {{\int \limits }_{0}^{t}} Z_{t}^{*}J Z_{\tau } v_{\tau } d \tau \) and \(\mathcal {V} = \left \{v \vert {{\int \limits }_{0}^{1}} X_{t} v_{t} dt =0\right \}\). Consider the following linear quadratic system on \(\mathbb {R}^{m}\):

$$ f_{u}(q) = B_{t} u, \quad \varphi_{t}(u,x) =\frac{1}{2}\vert u \vert^{2}+ \langle {\Omega}_{t} u,x \rangle, $$

where Bt and Ωt are matrices of size m × k; the Hamiltonian of PMP reads:

$$ h_{u}(\lambda,x) = \langle \lambda, B_{t} u \rangle -\frac{1}{2}\vert u \vert^{2} -\langle {\Omega}_{t} u,x \rangle $$

Take as extremal control ut ≡ 0; it is easy to check that the re-parametrization flow \(\tilde {\Phi }_{t}\) defined in (23) is just the identity and that the matrix Zt for this problem is the following:

$$ Z_{t} = \left( \begin{array}{c} {\Omega}_{t} \\B_{t} \end{array}\right) $$

So it is enough to take Ωt = Yt and Bt = Xt. □

We can also reformulate the second part of Theorem 2, relating the capacity of K to the eigenvalues of \(\mathcal {A}\). We make the following assumptions:

  1. the map \(t \mapsto Z_{t}\) is piecewise analytic in t;

  2. the maximum condition in the statement of PMP defines a C2 function \( \hat {H}_{t}(\lambda ) = \max \limits _{u\in \mathbb R^{k}} {h^{t}_{u}}(\lambda )\) in a neighbourhood of the strictly normal regular extremal we are considering.

Under the above assumptions the following proposition clarifies the link between the matrices Zt and Ht and the function \(\hat {H}_{t}\). A proof can be found either in [7, Proposition 21.3] or [1].

Proposition 7

Suppose that \((\lambda (t),\tilde {u})\) is an extremal and that the function \(\hat H_{t}\) is C2. Using the flow defined in (23), define \({\mathscr{H}}_{t}(\lambda ) = (\hat H_{t}-h_{\tilde u(t)})\circ \tilde {\Phi }_{t}(\lambda ) \). It holds that:

$$ \text{Hess}_{\lambda_{0}}(\mathcal H_{t}) = JZ_{t}H_{t}^{-1}Z_{t}^{*}J $$

Define \(R_{t} = \max \limits _{v \in \mathbb {R}^{k},\vert \vert v\vert \vert =1} \vert \vert Z_{t} v\vert \vert \) and let \(\{\pm i\zeta _{j}(t)\}_{j=1}^{l}\) be the eigenvalues of \(Z_{t}^{*}JZ_{t}\) as defined in Section 4. We have the following proposition.

Proposition 8

The capacity ξ of K satisfies:

$$ \xi \le \frac{\sqrt{k} \vert\vert R_{t}\vert\vert_{2}}{2} \sqrt{{{\int}_{0}^{1}} \text{tr}(\text{Hess}_{\lambda_{0}}(\mathcal H_{t})) dt} $$

and in particular, if we arrange the functions ζj(t) in decreasing order, they satisfy

$$ 0 \le \zeta_{j}(t) \le R_{t} \sqrt{\lambda_{2j}(t)}, \quad j \in \{1,{\dots},l\} $$

where λj(t) are the eigenvalues of \(Hess_{\lambda _{0}}(\mathcal H_{t})\) in decreasing order.

Proof

We give a sketch of the proof. Without loss of generality, we can assume Ht ≡−Id; otherwise, we can perform the change of coordinates \(v \mapsto (-H_{t})^{-\frac {1}{2}}v\) on \(L^{2}([0,1],\mathbb {R}^{k})\) and redefine Zt accordingly.

In this notation \(Hess_{\lambda _{0}}({\mathscr{H}}_{t})\) corresponds to the matrix \(JZ_{t}Z_{t}^{*}J\). If we square \(A_{t} = Z_{t}^{*}JZ_{t}\) we obtain:

$$ A_{t}^{*}A_{t} = -Z_{t}^{*}JZ_{t}Z_{t}^{*}JZ_{t} = - Z_{t}^{*}\left( J Z_{t}Z_{t}^{*} J\right)Z_{t} = -Z_{t}^{*}Hess_{\lambda_{0}}(\mathcal{H}_{t})Z_{t} $$

Observe that ±iζj(t) are eigenvalues of At if and only if \({\zeta _{j}^{2}}(t)\) is an eigenvalue of \(A^{*}_{t}A_{t} = -{A_{t}^{2}}\). The equation above relates the restriction of \(Hess_{\lambda _{0}}({\mathscr{H}}_{t})\) to the image of the map \(Z_{t}: \mathbb {R}^{k}\to \mathbb {R}^{2n}\) with the squares of the functions ζj(t) defining the capacity.

The idea is to use the Cauchy interlacing inequality for the eigenvalues of \(Hess_{\lambda _{0}}({\mathscr{H}}_{t})\) and of its restriction to a subspace of codimension 2n − k. If \(\{\lambda _{j}(t)\}_{j=1}^{2n}\) are the eigenvalues of the Hessian, taken in decreasing order, and \(\{\mu _{j}(t)\}_{j=1}^{k}\) the eigenvalues of its restriction, we have:

$$\lambda_{j+2n-k}(t) \le \mu_{j}(t) \le \lambda_{j}(t) $$
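The interlacing is easy to visualize numerically: the sketch below (random data) compresses a symmetric matrix to a random k-dimensional subspace and checks the two-sided bound.

```python
import numpy as np

rng = np.random.default_rng(3)
dim, k = 8, 3                                       # ambient dimension 2n = 8
B = rng.standard_normal((dim, dim))
H = (B + B.T) / 2                                   # stand-in for the Hessian
lam = np.sort(np.linalg.eigvalsh(H))[::-1]          # eigenvalues, decreasing order

V, _ = np.linalg.qr(rng.standard_normal((dim, k)))  # orthonormal basis of a subspace
mu = np.sort(np.linalg.eigvalsh(V.T @ H @ V))[::-1] # eigenvalues of the restriction
for j in range(k):
    assert lam[j + dim - k] - 1e-12 <= mu[j] <= lam[j] + 1e-12
print(lam, mu)
```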

In our case, the maps Zt are not orthogonal projections, but we can adjust the estimates by considering how much the matrices Zt dilate the space; thus we have to take into account the function Rt defined just before the statement. Denote by μj(t) the j-th eigenvalue of \(-{A_{t}^{2}}\); putting everything together, we have:

$$0 \le \mu_{j}(t) \le {R_{t}^{2}} \lambda_{2j}(t),\quad j \in \{1,{\dots},k\}$$

where we shifted the index by one, since μ2j− 1(t) = μ2j(t) for all j ≤ l. Taking square roots and integrating, we have:

$${{\int}_{0}^{1}}\zeta_{j}(t)dt\le {{\int}_{0}^{1}}R_{t} \sqrt{\lambda_{2j}(t)}dt $$

Summing over j, using the Cauchy–Schwarz inequality and the facts that there are at most k/2 non-zero functions ζj and that the λj are arranged in decreasing order, we find:

$$\xi = {{\int}_{0}^{1}}{\sum}_{j} \zeta_{j}(t) dt \le {{\int}_{0}^{1}} {\sum}_{j} R_{t}\sqrt{\lambda_{2j}(t)}dt \le \frac{\sqrt{k}\vert \vert R_{t} \vert \vert_{2}}{2} \sqrt{{{\int}_{0}^{1}} \text{tr} (Hess_{\lambda_{0}} (\mathcal{H}_{t})) dt} $$

□

We now turn to Theorem 1; we can interpret it as a quantitative version of various necessary optimality conditions that one can formulate for certain classes of singular extremals (see [7, Chapter 20] or [4, Chapter 12]). Moreover, leaving optimality conditions aside, Theorem 1 gives the asymptotic distribution of the eigenvalues of the second variation for totally singular extremals (see Definition 3).

As mentioned in the previous section, we can produce a second variation also in the non-strictly normal case, which is at least formally very similar to the normal one. However, a common occurrence is that the matrix Ht degenerates completely and is constantly equal to the zero matrix. This is the case for affine control systems and abnormal extremals in sub-Riemannian geometry, i.e. systems of the form:

$$ f_{u} = {\sum}_{i=1}^{l} f_{i} u_{i} +f_{0}, \quad f_{i} \text{ smooth vector fields} $$

In this case, the Legendre condition Ht ≤ 0 (see the previous section) does not give much information, and one looks for higher order optimality conditions. This is usually done exactly as in Lemma 1: the first optimality conditions one finds are the Goh condition and the generalized Legendre condition, which prevent the second variation from being strongly indefinite.

In the notation of Lemma 1, the Goh condition reads Q1 ≡ 0, i.e. \(Z_{t}^{*}JZ_{t} \equiv 0\). It can be reformulated in geometric terms as follows: if λt is the extremal, then

$$ \lambda_{t} [\partial_{u} f_{u}(q(t))v_{1},\partial_{u} f_{u}(q(t))v_{2}] =0, \forall v_{1},v_{2} \in \mathbb{R}^{k} $$

From Theorem 1, it is clear that if Q1≢0, the second variation has infinite negative index and the eigenvalues distribute evenly between the negative and positive parts of the spectrum. One then asks that the second term Q2 be non-positive definite (recall the different sign convention in Proposition 6); otherwise, the negative part of the spectrum of − Q2 becomes infinite. In our notation, this condition reads:

$$ (Z_{t}^{(1)})^{*}JZ_{t} \le 0 \iff \sigma (Z_{t}^{(1)} v, Z_{t} v) \le 0, \forall v \in \mathbb{R}^{k}. $$

Again, it can be translated into a differential condition along the extremal; however, this time it will in general involve more than just commutators if the system is not control affine.

If Q2 ≡ 0, one can take more derivatives and find new conditions. In particular, using the notation of Lemma 1, one always has to ask that the first non-zero term in the expansion be of even order and that the matrix of its coefficients be non-positive in order to have finite negative index.