1 Introduction

The study and elaboration of the Bounded Real Lemma (BRL) has a rich history, beginning with the work of Kalman [20], Yakubovich [35] and Popov [25]. From the beginning, the Kalman-Yakubovich-Popov (KYP) lemma was viewed more broadly as the quest to establish the equivalence between a frequency-domain inequality (FDI) and a Linear Matrix Inequality (LMI). In our case, the latter will actually be a Linear Operator Inequality.

A finite dimensional, linear input-output system in continuous time is frequently written in input/state/output form

$$\begin{aligned} \Sigma :\quad \begin{bmatrix}{\dot{{{\mathbf {x}}}}}(t)\\ {{\mathbf {y}}}(t)\end{bmatrix} = \begin{bmatrix}A&{}B\\ C&{}D\end{bmatrix} \begin{bmatrix}{{\mathbf {x}}}(t)\\ {{\mathbf {u}}}(t)\end{bmatrix}, \quad t\geqslant 0,\quad {{\mathbf {x}}}(0)=x_0, \end{aligned}$$
(1.1)

where the state \({{\mathbf {x}}}(t)\) at time t takes values in the state space \(X={{\mathbb {C}}}^n\) (with \({{\mathbb {C}}}\) denoting the set of complex numbers), the input \({{\mathbf {u}}}(t)\) lives in the input space \(U={{\mathbb {C}}}^m\), and the output \({{\mathbf {y}}}(t)\) in the output space \(Y={{\mathbb {C}}}^k\), and where A, B, C, D are matrices of appropriate sizes. The initial time is \(t=0\) and \(x_0\in X\) is the given initial state of the system. By the elementary theory of differential equations, the unique solution of (1.1) is

$$\begin{aligned} \left\{ \begin{aligned} {{\mathbf {x}}}(t)&= e^{At} x_0 + \int _0^t e^{A(t-s)} B {{\mathbf {u}}}(s) \,{\mathrm {d}}s, \\ {{\mathbf {y}}}(t)&= C e^{At} x_0 +\int _0^t C e^{A(t-s)} B {{\mathbf {u}}}(s) \,{\mathrm {d}}s + D{{\mathbf {u}}}(t). \end{aligned} \right. \end{aligned}$$
(1.2)
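As a quick numerical sanity check of (1.2), the following sketch (Python with NumPy/SciPy; the matrices and the input signal are hypothetical) compares the variation-of-constants formula with a direct integration of (1.1):

```python
import numpy as np
from scipy.integrate import solve_ivp, trapezoid
from scipy.linalg import expm

# Hypothetical data: two-dimensional state space, scalar input and output.
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.5]])
x0 = np.array([1.0, -1.0])
u = lambda t: np.array([np.sin(t)])           # input signal u(t)

# Reference: integrate x'(t) = A x(t) + B u(t) directly.
T = 2.0
x_ode = solve_ivp(lambda t, x: A @ x + B @ u(t), (0.0, T), x0,
                  rtol=1e-10, atol=1e-12).y[:, -1]

# Variation-of-constants formula (1.2), trapezoidal quadrature in s.
s = np.linspace(0.0, T, 2001)
integrand = np.stack([expm(A * (T - si)) @ B @ u(si) for si in s])
x_voc = expm(A * T) @ x0 + trapezoid(integrand, s, axis=0)
y_T = C @ x_voc + D @ u(T)                    # output formula in (1.2) at t = T

print(np.allclose(x_ode, x_voc, atol=1e-6))   # True
```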

Taking Laplace transforms in (1.2), we get

$$\begin{aligned} \left\{ \begin{aligned} {\widehat{{{\mathbf {x}}}}}(\lambda )&= (\lambda -A)^{-1} x_0 + (\lambda - A)^{-1}B{\widehat{{{\mathbf {u}}}}}(\lambda ), \\ {\widehat{{{\mathbf {y}}}}}(\lambda )&= C (\lambda -A)^{-1} x_0 + {\widehat{{\mathfrak {D}}}}(\lambda ){\widehat{{{\mathbf {u}}}}}(\lambda ), \end{aligned} \right. \end{aligned}$$

where

$$\begin{aligned} {\widehat{{\mathfrak {D}}}}(\lambda ) = C (\lambda - A)^{-1} B+D \end{aligned}$$
(1.3)

is called the transfer function of the linear system (1.1). In particular, when \(x_0=0\), we get

$$\begin{aligned} {\widehat{{{\mathbf {y}}}}}(\lambda ) = {\widehat{{\mathfrak {D}}}}(\lambda ) {\widehat{{{\mathbf {u}}}}}(\lambda ), \end{aligned}$$
(1.4)

i.e., the transfer function maps the Laplace transform of the input signal into the Laplace transform of the output signal. Alternatively, let us make the Ansatz that \({{\mathbf {u}}}(t)=e^{\lambda t}u_0\), \({{\mathbf {x}}}(t)=e^{\lambda t}x_0\) and \({{\mathbf {y}}}(t)=e^{\lambda t}y_0\) form a trajectory on \({{\mathbb {R}}}\), where \(u_0\), \(x_0\) and \(y_0\) are constant vectors. Then \({\dot{{{\mathbf {x}}}}}(t)=\lambda e^{\lambda t}x_0\) and the first equation in (1.1) gives \(x_0=(\lambda -A)^{-1}Bu_0\). Plug this into the second equation of (1.1) to get \(y_0={\widehat{{\mathfrak {D}}}}(\lambda )u_0\). Hence, the transfer function maps the amplitude of the input wave to the amplitude of the output wave, and this gives a second interpretation of the transfer function as a frequency response function. This second interpretation can be extended to time-varying linear systems as well; see [8]. For finite dimensional systems, the Laplace transform version is more common, but for infinite-dimensional systems, the frequency response version is more accessible.
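The two interpretations agree, as the following numerical sketch illustrates (Python with NumPy; hypothetical data, with \(\lambda \) in the open right half-plane):

```python
import numpy as np

# Hypothetical data; lam lies in the open right half-plane.
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.5]])
lam = 1.0 + 2.0j
n = A.shape[0]

# Transfer function (1.3): Dhat(lam) = C (lam - A)^{-1} B + D.
Dhat = C @ np.linalg.solve(lam * np.eye(n) - A, B) + D

# Ansatz: u(t) = e^{lam t} u0 forces x0 = (lam - A)^{-1} B u0 ...
u0 = np.array([1.0])
x0 = np.linalg.solve(lam * np.eye(n) - A, B @ u0)
y0 = C @ x0 + D @ u0               # ... and y0 is the output amplitude

print(np.allclose(y0, Dhat @ u0))  # True: y0 = Dhat(lam) u0
```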

We will be particularly interested in the case where \({\widehat{{\mathfrak {D}}}}(\lambda )\) is analytic on the right half-plane \({{\mathbb {C}}}^+\). If in addition \(\Vert {\widehat{{\mathfrak {D}}}}(\lambda ) \Vert \le 1\) for all \(\lambda \) in the open right half-plane \({{\mathbb {C}}}^+\), we say that \({\widehat{{\mathfrak {D}}}}\) is in the Schur class (with respect to \({{\mathbb {C}}}^+\)), denoted \({{\mathcal {S}}}_{U,Y}\).

What we shall call the standard bounded real lemma (standard BRL) is concerned with characterizing in terms of the system matrix \(\left[ {\begin{matrix} A &{} B \\ C &{} D \end{matrix}}\right] \) when it is the case that the associated transfer function \({\widehat{{\mathfrak {D}}}}(\lambda )\) is in \({{\mathcal {S}}}_{U,Y}\). A variation of the problem is the strict bounded real lemma which is concerned with the problem of characterizing in terms of the system matrix \(\left[ {\begin{matrix}A &{} B \\ C &{} D \end{matrix}}\right] \) when the associated transfer function \({\widehat{{\mathfrak {D}}}}(\lambda )\) is in the strict Schur class \({{\mathcal {S}}}^0_{U, Y}\), i.e., when there exists a \(\rho <1\) such that \(\Vert {\widehat{{\mathfrak {D}}}}(\lambda ) \Vert \le \rho \) for all \(\lambda \in {{\mathbb {C}}}^+\). For the finite dimensional case, the problem is well understood (see [7, 33] for the standard case and [24] for the strict case), while for the infinite dimensional case the results are less complete (but see [5] for the standard case). Our goal here is to provide a unified approach to the standard and the strict bounded real lemmas for infinite dimensional well-posed systems in continuous time (as in [31]); in fact, at that level of generality, this appears to be the first attempt at a strict bounded real lemma.

We shall make use of the concept of storage function as introduced by J. Willems in his study of dissipative systems [33, 34], closely related to independent work [4] of D. Arov appearing around the same time. Here we concentrate on the special case of “scattering” supply rate: \(s(u,y) = \Vert u \Vert ^2 - \Vert y \Vert ^2\).

Definition 1.1

The function \(S:X\rightarrow [0,\infty ]\) is a storage function for \(\Sigma \) if \(S(0)=0\) and for all trajectories \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma \) with initial time 0 and for all \(t>0\), it holds that

$$\begin{aligned} S\left( {{\mathbf {x}}}(t)\right) +\int _0^t\Vert {{\mathbf {y}}}(s)\Vert _Y^2\,{\mathrm {d}}s \leqslant S\left( {{\mathbf {x}}}(0)\right) + \int _0^t\Vert {{\mathbf {u}}}(s)\Vert _U^2\,{\mathrm {d}}s. \end{aligned}$$
(1.5)

If \(S(x)=\Vert x\Vert _X^2\) is a storage function for \(\Sigma \), then \(\Sigma \) is called passive.
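As an illustration, the following sketch (Python with NumPy/SciPy; a hypothetical scalar system for which \(S(x)=\Vert x\Vert ^2\) is a storage function, so that the system is passive) verifies the dissipation inequality (1.5) along a simulated trajectory:

```python
import numpy as np
from scipy.integrate import solve_ivp, cumulative_trapezoid

# Hypothetical scalar passive system: A = -2, B = C = 1, D = 0.
A, B, C, D = -2.0, 1.0, 1.0, 0.0
x0 = 1.0
u = lambda t: np.cos(3.0 * t)

t = np.linspace(0.0, 5.0, 2001)
x = solve_ivp(lambda s, xv: A * xv + B * u(s), (0.0, 5.0), [x0],
              t_eval=t, rtol=1e-10, atol=1e-12).y[0]
y = C * x + D * u(t)

S = lambda xv: xv ** 2                               # candidate storage function
lhs = S(x) + cumulative_trapezoid(y ** 2, t, initial=0.0)
rhs = S(x0) + cumulative_trapezoid(u(t) ** 2, t, initial=0.0)
print(np.all(lhs <= rhs + 1e-8))                     # (1.5) holds for all t
```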

An easy consequence of this notion of dissipativity (i.e., existence of a storage function) is what we shall call input/output dissipativity, namely: In case the system is initialized with the initial state \(x_0\) set equal to 0, then the energy drained out of the system over the interval [0, t] via the output \({{\mathbf {y}}}\) cannot exceed the energy inserted into the system over the same interval via the input \({{\mathbf {u}}}\): that is,

$$\begin{aligned} \int _0^t\Vert {{\mathbf {y}}}(s)\Vert _Y^2\,{\mathrm {d}}s \leqslant \int _0^t\Vert {{\mathbf {u}}}(s)\Vert _U^2\,{\mathrm {d}}s,\quad \text {subject to }x_0=0. \end{aligned}$$

This implies that the transfer function is in the Schur class; more details on this can be found in Proposition 6.1 below. A non-obvious point is that the converse holds: if \({\widehat{{\mathfrak {D}}}} \in {{\mathcal {S}}}_{U,Y}\), then a storage function exists for \(\Sigma \), and this will be one of the statements in our standard BRL. Similarly, as we shall see, \({\widehat{{\mathfrak {D}}}}\) being in the strict Schur class is equivalent to \(\Sigma \) having what we shall call a strict storage function (see Definition 1.4 below).

For a suitable function \({{\mathbf {u}}}\), let \(\tau ^t\) denote the backward-shift operator

$$\begin{aligned} (\tau ^t {{\mathbf {u}}})(s) = {{\mathbf {u}}}(t+s), \quad t \in {{\mathbb {R}}}, \, t+s \in {\text {dom}}({{\mathbf {u}}}). \end{aligned}$$

By time-invariance of the system equations (1.1) we see that for any \(t_0 > 0\) the backward-shifted trajectory \((\tau ^{t_0}{{\mathbf {u}}}, \tau ^{t_0}{{\mathbf {x}}}, \tau ^{t_0}{{\mathbf {y}}})\) is again a system trajectory whenever \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) is a system trajectory. Setting \(t_1 = t_0 > 0\), \(t_2 = t + t_0>t_1\) and rewriting the resulting version of (1.5) as

$$\begin{aligned} S({{\mathbf {x}}}(t_2)) - S({{\mathbf {x}}}(t_1)) \le \int _{t_1}^{t_2} \Vert {{\mathbf {u}}}(s)\Vert _U^2\,{\mathrm {d}}s - \int _{t_1}^{t_2} \Vert {{\mathbf {y}}}(s)\Vert _Y^2\,{\mathrm {d}}s, \end{aligned}$$

we see that the dissipation inequality (1.5) can be interpreted as saying that the net energy stored by the system state over the interval \([t_1, t_2]\) is no more than the net energy supplied to the system by the outside environment over the same time interval.

In order to state the standard and strict bounded real lemmas even for the finite dimensional case, we need to carefully distinguish different notions of positivity for Hermitian matrices.

Definition 1.2

For H an \(n \times n\) Hermitian matrix over \({{\mathbb {C}}}\), we write

  • \(H \succ 0\) if \(\langle H x, x \rangle > 0\) for all nonzero x in \({{\mathbb {C}}}^{n}\) (equivalently for the finite dimensional case here, for some \(\delta > 0\) we have \(\langle H x, x \rangle \ge \delta \Vert x \Vert ^2\) for all \(x \in {{\mathbb {C}}}^n\)),

  • \(H \prec 0\) if \(-H \succ 0\),

  • \(H \succeq 0\) if \(\langle H x, x \rangle \ge 0\) for all \(x \in {{\mathbb {C}}}^n\),

  • \(H \preceq 0\) if \(-H \succeq 0\).
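In finite dimensions these notions reduce to eigenvalue tests; a minimal sketch (Python with NumPy):

```python
import numpy as np

def positivity(H, tol=1e-12):
    """Classify a Hermitian matrix according to Definition 1.2."""
    H = 0.5 * (H + H.conj().T)          # guard against rounding asymmetry
    ev = np.linalg.eigvalsh(H)
    if ev.min() > tol:
        return "H > 0 (strictly; delta = smallest eigenvalue)"
    if ev.min() >= -tol:
        return "H >= 0"
    if ev.max() < -tol:
        return "H < 0"
    if ev.max() <= tol:
        return "H <= 0"
    return "indefinite"

print(positivity(np.array([[2.0, 1.0], [1.0, 2.0]])))  # eigenvalues 1, 3: H > 0
print(positivity(np.array([[1.0, 1.0], [1.0, 1.0]])))  # eigenvalues 0, 2: H >= 0
```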

Theorem 1.3

(Standard finite dimensional bounded real lemma; see e.g. [7, 33]) For a finite-dimensional linear system \(\Sigma \) with system matrix \({{\mathbf {S}}}= \left[ {\begin{matrix} A &{} B \\ C &{} D \end{matrix}}\right] \) as in (1.1) which is minimal (i.e., \(\text{ rank }\, [B\ AB\ \cdots \ A^{n-1}B]=n\) (controllability) and \(\text{ rank }\, [C^*\ A^*C^* \ \cdots \ A^{*n-1}C^*]=n\) (observability)), the following conditions are equivalent:

  1. (1)

    After unique analytic continuation (if necessary) to a domain \({\text {dom}}({\widehat{{\mathfrak {D}}}}) \supset {{\mathbb {C}}}^+\), \({\widehat{{\mathfrak {D}}}}\) is in the Schur class \({{\mathcal {S}}}_{U,Y}\).

  2. (2)

    The following continuous-time Kalman-Yakubovich-Popov (KYP) inequality has a solution \(H\succ 0\):

    $$\begin{aligned} \begin{bmatrix}HA+A^*H+C^*C &{} HB+C^*D \\ B^*H+D^*C &{} D^*D-I\end{bmatrix} \preceq 0. \end{aligned}$$
    (1.6)
  3. (3)

    The system \(\Sigma \) is similar to a passive system \(\Sigma ^\circ \), i.e., there exist \(X^\circ \) and an invertible \(\Gamma :X \rightarrow X^\circ \) such that

    $$\begin{aligned} \begin{bmatrix}A^\circ &{}B^\circ \\ C^\circ &{}D^\circ \end{bmatrix}: = \begin{bmatrix}\Gamma &{}0\\ 0&{}I\end{bmatrix}\begin{bmatrix}A&{}B\\ C&{}D\end{bmatrix} \begin{bmatrix}\Gamma ^{-1}&{}0\\ 0&{}I\end{bmatrix} \end{aligned}$$
    (1.7)

    satisfies (1.6) with \(H = 1_{X^\circ }\).

  4. (4)

    The system \(\Sigma \) has a storage function.

  5. (5)

    The system \(\Sigma \) has a quadratic storage function (see below).

Here by a quadratic storage function we mean a storage function S of the special form \(S(x) = \langle H x, x \rangle \), where \(H \succeq 0\) is a Hermitian matrix. If H is positive definite (\(H \succ 0\)) then \(S = S_H\) has the additional property that S is coercive (there is a \(\delta > 0\) so that \(S_H(x) \ge \delta \Vert x \Vert ^2\) for all \(x \in X\)). The connection between a solution \(H \succeq 0\) of the KYP-inequality (1.6) and a quadratic storage function is that any \(H \succeq 0\) satisfying (1.6) generates a quadratic storage function S for \(\Sigma \) according to \(S(x) = S_H(x):= \langle H x, x \rangle \). The strict bounded real lemma is concerned with an analogous characterization of the strict Schur class \({{\mathcal {S}}}^0_{U,Y}\).
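In finite dimensions the correspondence between KYP-solutions and quadratic storage functions is easy to test numerically; the following sketch (Python with NumPy; hypothetical data) builds the left-hand side of (1.6) and checks \(\preceq 0\) by an eigenvalue computation:

```python
import numpy as np

def kyp_lhs(A, B, C, D, H):
    """Left-hand side of the KYP-inequality (1.6)."""
    m = B.shape[1]
    return np.block([
        [H @ A + A.conj().T @ H + C.conj().T @ C, H @ B + C.conj().T @ D],
        [B.conj().T @ H + D.conj().T @ C,         D.conj().T @ D - np.eye(m)],
    ])

# Hypothetical example: for this system H = I satisfies (1.6), so
# S_H(x) = <Hx, x> = ||x||^2 is a (quadratic) storage function.
A = np.array([[-2.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
H = np.eye(1)
M = kyp_lhs(A, B, C, D, H)
print(np.linalg.eigvalsh(M).max() <= 1e-12)   # True: (1.6) holds
```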

To formulate the strict result let us introduce the following terminology.

Definition 1.4

Suppose \(S :X \rightarrow [0, \infty ]\) is such that \(S(0) = 0\) and \(\Sigma \) is a well-posed linear system with system trajectories \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) initialized at \(t = 0\). Then we say that:

  1. (1)

    S is a strict storage function for \(\Sigma \) if there is a \(\delta > 0\) so that, for all system trajectories \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) of \(\Sigma \) and \(0 \le t_1 < t_2\) we have

    $$\begin{aligned}&S({{\mathbf {x}}}(t_2)) + \delta \int _{t_1}^{t_2} \Vert {{\mathbf {x}}}(s) \Vert ^2 \,{\mathrm {d}}s + \int _{t_1}^{t_2} \Vert {{\mathbf {y}}}(s) \Vert ^2 \,{\mathrm {d}}s \nonumber \\&\quad \le S({{\mathbf {x}}}(t_1)) + (1 - \delta ) \int _{t_1}^{t_2} \Vert {{\mathbf {u}}}(s) \Vert ^2 \,{\mathrm {d}}s. \end{aligned}$$
    (1.8)
  2. (2)

    S is a semi-strict storage function for \(\Sigma \) if condition (1.8) holds but with the integral term involving the state vector \({{\mathbf {x}}}(s)\) omitted, i.e., if there is a \(\delta > 0\) so that, for all system trajectories \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) and \(0 \le t_1 < t_2\) we have

    $$\begin{aligned} S({{\mathbf {x}}}(t_2)) + \int _{t_1}^{t_2} \Vert {{\mathbf {y}}}(s) \Vert ^2 \,{\mathrm {d}}s \le S({{\mathbf {x}}}(t_1)) + (1 - \delta ) \int _{t_1}^{t_2} \Vert {{\mathbf {u}}}(s) \Vert ^2 \,{\mathrm {d}}s. \end{aligned}$$
    (1.9)

In the following result the equivalence (1) \(\Leftrightarrow \) (2) is due to Petersen-Anderson-Jonckheere [24] (at least for the special case \(D = 0\)); we add the connections with similarity and storage functions for the strict setting.

Theorem 1.5

(Finite dimensional strict bounded real lemma) Suppose that \(\Sigma \) is a finite dimensional linear system with system matrix \({{\mathbf {S}}}= \left[ {\begin{matrix} A &{} B \\ C &{} D \end{matrix}}\right] \) as in (1.1) such that the matrix A is stable (i.e., A has spectrum only in the open left half plane: \(\sigma (A) \subset {{\mathbb {C}}}^-:=\{\lambda \in {{\mathbb {C}}}\mid {\text {Re}}\,(\lambda )<0\}\)). Then the following conditions are equivalent:

  1. (1)

    Possibly after unique analytic continuation to a domain \({\text {dom}}({\widehat{{\mathfrak {D}}}}) \supset {{\mathbb {C}}}^+\), \({\widehat{{\mathfrak {D}}}}\) is in the strict Schur class \({{\mathcal {S}}}^0_{U,Y}\).

  2. (2)

    The following continuous-time strict Kalman-Yakubovich-Popov (KYP) inequality has a solution \(H\succ 0\):

    $$\begin{aligned} \begin{bmatrix}HA+A^*H+C^*C &{} HB+C^*D \\ B^*H+D^*C &{} D^*D-I\end{bmatrix} \prec 0. \end{aligned}$$
    (1.10)
  3. (3)

    The system \(\Sigma \) is similar to a strictly passive system \(\Sigma ^\circ \), i.e., there exist \(X^\circ \) and an invertible \(\Gamma :X \rightarrow X^\circ \) such that (1.7) satisfies (1.10) with \(H = 1_{X^\circ }\).

  4. (4)

    The system \(\Sigma \) has a quadratic, coercive strict storage function.

  5. (5)

    The system \(\Sigma \) has a semi-strict storage function.
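
In the finite dimensional case, condition (1) of Theorem 1.5 can be probed numerically: for stable A the supremum of \(\Vert {\widehat{{\mathfrak {D}}}}(\lambda )\Vert \) over \({{\mathbb {C}}}^+\) equals the supremum along the imaginary axis (maximum modulus principle), so a frequency grid yields an estimate of the smallest admissible \(\rho \). A minimal sketch (Python with NumPy; the data are hypothetical):

```python
import numpy as np

# Hypothetical stable system (spectrum of A in the open left half-plane).
A = np.array([[-2.0, 1.0], [0.0, -3.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[0.5, 0.0]])
D = np.array([[0.1]])
n = A.shape[0]

def Dhat(lmb):                      # transfer function (1.3)
    return C @ np.linalg.solve(lmb * np.eye(n) - A, B) + D

omegas = np.linspace(-200.0, 200.0, 8001)
rho = max(np.linalg.norm(Dhat(1j * w), 2) for w in omegas)   # spectral norms
print(rho, rho < 1.0)   # grid estimate; rho < 1 indicates the strict Schur class
```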

In the infinite dimensional case, we wish to allow any of the coefficient spaces, i.e., the input space U, the state space X, or the output space Y, to be an infinite dimensional Hilbert space. The situation becomes more involved in at least three respects:

  • The system matrix \(\left[ {\begin{matrix}A&{}B\\ C&{}D \end{matrix}}\right] \) is replaced by an (in general) unbounded system node (see [31,  Definition 4.7.2], [5, §2] or §4 below for details) between Hilbert spaces U, X and Y. Here we restrict ourselves to the setting of well-posed systems, i.e., in place of the system matrix \(\left[ {\begin{matrix} A &{} B \\ C &{} D \end{matrix}}\right] \) as in (1.1) there is a well-defined one-parameter family of block \(2 \times 2\) operator matrices

    $$\begin{aligned} \begin{bmatrix} {\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t \end{bmatrix} :\begin{bmatrix} X \\ L^2([0,t], U) \end{bmatrix} \rightarrow \begin{bmatrix} X \\ L^2([0,t], Y) \end{bmatrix}, \quad t > 0, \end{aligned}$$

    which corresponds to the mapping such that

$$\begin{aligned} \begin{bmatrix} {\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t \end{bmatrix} :\begin{bmatrix} {{\mathbf {x}}}(0) \\ \pi _{[0,t]} {{\mathbf {u}}}\end{bmatrix} \mapsto \begin{bmatrix} {{\mathbf {x}}}(t) \\ \pi _{[0,t]} {{\mathbf {y}}}\end{bmatrix}, \quad t > 0, \end{aligned}$$

    whenever \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) is a system trajectory. It is often advantageous to work with the “integrated operators” \({\mathfrak {A}}^t\), \({\mathfrak {B}}^t\), \({\mathfrak {C}}^t\), \({\mathfrak {D}}^t\) instead of with the system node directly. In case the system is finite dimensional and given by the system matrix \(\left[ {\begin{matrix} A &{} B \\ C &{} D \end{matrix}}\right] \), one can read off from (1.2) that the integrated operators \({\mathfrak {A}}^t\), \({\mathfrak {B}}^t\), \({\mathfrak {C}}^t\), \({\mathfrak {D}}^t\) are given by

$$\begin{aligned}&{\mathfrak {A}}^t :x_0 \mapsto e^{At} x_0,\\&{\mathfrak {B}}^t :{{\mathbf {u}}}|_{[0,t]} \mapsto \int _0^t e^{A(t - s)} B {{\mathbf {u}}}(s) \,{\mathrm {d}}s,\quad {\mathfrak {C}}^t :x_0 \mapsto C e^{As} x_0|_{0 \le s \le t},\\&{\mathfrak {D}}^t :{{\mathbf {u}}}|_{[0,t]} \mapsto \bigg ( C \int _0^s e^{A(s-s')} B {{\mathbf {u}}}(s') \,{\mathrm {d}}s' + D {{\mathbf {u}}}(s) \bigg ) \bigg |_{0 \le s \le t}. \end{aligned}$$

    To get some additional flexibility with respect to the choice of location \(t_0\) for the specification of the initial condition (\({{\mathbf {x}}}(t_0) = x_0\)), Staffans (see [31, page 30]) defines three “master operators”

    $$\begin{aligned} \begin{aligned}&{\mathfrak {B}}{{\mathbf {u}}}: = \int _{-\infty }^0 {\mathfrak {A}}^{-s} B {{\mathbf {u}}}(s) \,{\mathrm {d}}s, \quad {\mathfrak {C}}x: = \bigg ( t \mapsto C {\mathfrak {A}}^t x\bigg )_{t \ge 0}, \\&{\mathfrak {D}}{{\mathbf {u}}}: = \bigg ( t \mapsto \int _{-\infty }^t C {\mathfrak {A}}^{t-s} B {{\mathbf {u}}}(s) \,{\mathrm {d}}s + D {{\mathbf {u}}}(t) \bigg )_{t \in {{\mathbb {R}}}} \end{aligned} \end{aligned}$$
    (1.11)

    and observes that the analogues of \({\mathfrak {B}}^t\), \({\mathfrak {C}}^t\), \({\mathfrak {D}}^t\) for the case where the initial condition is taken at \(t = t_0\) rather than \(t=0\) (denoted as \({\mathfrak {B}}^t_{t_0}\), \({\mathfrak {C}}^t_{t_0}\), \({\mathfrak {D}}^t_{t_0}\)) are all easily expressed in terms of the master operators; for the case where \(t_0=0\) the formulas are as in Eq. (2.1) below.

    We let \(\left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \), the collection of operators written in block matrix form (even though it does not fit as the representation of a single operator between a two-component input space and a two-component output space), denote the associated well-posed linear system; a finite dimensional numerical sketch of the integrated operators follows below.
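For concreteness, here is a small sketch of the integrated operators for the finite dimensional model above (Python with NumPy/SciPy; the data are hypothetical and uniform-grid trapezoidal quadrature stands in for the integrals):

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import trapezoid

A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.5]])
t, N = 2.0, 400
s = np.linspace(0.0, t, N + 1)
h = t / N
Eh = expm(A * h)
E = [np.eye(2)]
for _ in range(N):
    E.append(Eh @ E[-1])                  # E[j] = e^{A j h} on the grid

frakA_t = E[N]                            # A^t : x0 -> e^{At} x0

def frakB_t(u):                           # B^t : u|[0,t] -> X
    vals = np.stack([E[N - k] @ B @ u(s[k]) for k in range(N + 1)])
    return trapezoid(vals, s, axis=0)

def frakC_t(x0):                          # C^t : x0 -> (C e^{As} x0), 0 <= s <= t
    return np.stack([C @ E[k] @ x0 for k in range(N + 1)])

def frakD_t(u):                           # D^t : u|[0,t] -> y|[0,t]
    rows = []
    for k in range(N + 1):
        vals = np.stack([C @ E[k - p] @ B @ u(s[p]) for p in range(k + 1)])
        rows.append(trapezoid(vals, s[:k + 1], axis=0) + D @ u(s[k]))
    return np.stack(rows)

# Trajectory property: [x(t); y|[0,t]] = [[A^t, B^t], [C^t, D^t]] [x0; u|[0,t]].
x0 = np.array([1.0, -1.0])
u = lambda tau: np.array([np.sin(tau)])
x_t = frakA_t @ x0 + frakB_t(u)
y = frakC_t(x0) + frakD_t(u)
print(np.round(x_t, 6), y.shape)
```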

  • Secondly, since the state space X may be infinite dimensional, the solution H of (1.6) can become unbounded, both from below and from above. In this case the notion of positivity for a (possibly unbounded) selfadjoint Hilbert-space operator becomes still more refined than that for the finite dimensional case (cf. Definition 1.2), as follows.

Definition 1.6

For an unbounded, densely defined, selfadjoint operator H on X with domain \({\text {dom}}(H)\) we say:

  1. (1)

    H is positive semidefinite (written \(H \succeq 0\)) when \(\left\langle Hx , x \right\rangle \geqslant 0\) for all \(x\in {\text {dom}}(H)\);

  2. (2)

    H is positive definite (written \(H \succ 0\)) whenever \(\left\langle Hx , x \right\rangle > 0\) for all \(0\ne x\in {\text {dom}}(H)\);

  3. (3)

    H is strictly positive definite (written \(H \gg 0\)) whenever there exists a \(\delta >0\) so that \(\left\langle Hx , x \right\rangle \geqslant \delta \Vert x\Vert ^2\) for all \(0\ne x\in {\text {dom}}(H)\).
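The gap between items (2) and (3) is genuinely infinite dimensional. For instance, on \(\ell ^2\) the diagonal operator \(H={\text {diag}}(1,\tfrac{1}{2},\tfrac{1}{3},\dots )\) is positive definite but not strictly positive definite; the following sketch (Python with NumPy) makes this visible through finite truncations:

```python
import numpy as np

# Truncations of H = diag(1, 1/2, 1/3, ...) on l^2: each truncation is
# positive definite, but the smallest eigenvalue 1/n tends to 0, so no
# single delta > 0 works for all n; the limit operator is not >> 0.
for n in (5, 50, 500):
    H = np.diag(1.0 / np.arange(1, n + 1))
    print(n, np.linalg.eigvalsh(H).min())
```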

  • By [21,  Theorem 3.35 on p. 281], each positive semidefinite operator H on X admits a positive semidefinite square root \(H^\frac{1}{2}\), for which we have \(H=H^\frac{1}{2}H^\frac{1}{2}\), and hence

    $$\begin{aligned} {\text {dom}}(H)=\left\{ x\in {\text {dom}}(H^\frac{1}{2}) \bigm \vert H^\frac{1}{2} x \in {\text {dom}}(H^\frac{1}{2}) \right\} \subset {\text {dom}}(H^\frac{1}{2}). \end{aligned}$$

    Throughout this paper we use the standard ordering for possibly unbounded positive semidefinite operators (see, e.g., [2,  §5] or [21,  (2.17) on p. 330]): given positive semidefinite operators \(H_1\) and \(H_2\) on a Hilbert space X, we write \(H_1 \preceq H_2\) if

    $$\begin{aligned} {\text {dom}}(H_2^{\frac{1}{2}}) \subset {\text {dom}}(H_1^{\frac{1}{2}}) \quad \text{ and }\quad \Vert H_1^{\frac{1}{2}} x\Vert \le \Vert H_2^{\frac{1}{2}}x \Vert \ \ \text{ for } \text{ all } x \in {\text {dom}}(H_2^{\frac{1}{2}}). \end{aligned}$$

    In case \(H_2\) and \(H_1\) are bounded, this amounts to the standard Loewner ordering for bounded selfadjoint operators. Similarly we define \(H_1 \prec H_2\) and \(H_1 \ll H_2\), and we write \(H_1 \succeq H_2\) (resp. \(H_1 \succ H_2\) and \(H_1 \gg H_2\)) whenever \(H_2 \preceq H_1\) (resp. \(H_2 \prec H_1\) and \(H_2 \ll H_1\)).

  • Thirdly, with all of A, B, C, D being possibly unbounded, it is more difficult to make sense of the formula (1.3) for the transfer function of the system \(\Sigma \). However, there is a formula for the well-posed-system setup based on the interpretation of the transfer function as a “frequency response function” which appeared at the beginning of the introduction. There is also a formula for the transfer function analogous to formula (1.3) expressed directly in terms of the associated system node \({{\mathbf {S}}}\) (see the formula (4.4) to come). All these ideas are worked out in detail in Staffans’ book [31] and the fragments needed here are reviewed in §2 and §4 below.

In the case of unbounded positive semidefinite solutions H, the associated quadratic function \(S_H\) should be allowed to take on the value infinity according to the formula:

$$\begin{aligned} S_H(x) = {\left\{ \begin{array}{ll} \Vert H^{\frac{1}{2}}x \Vert ^2_X &{} \text {if } x \in {\text {dom}}(H^{\frac{1}{2}}), \\ \infty &{} \text {if } x \notin {\text {dom}}(H^{\frac{1}{2}}). \end{array}\right. } \end{aligned}$$

Remark 1.7

Note that then H being bounded is detected in the associated quadratic function \(S_H\) by \(S_H\) being finite-valued, while H being strictly positive definite (i.e., \(H \gg 0\)) is detected in \(S_H\) by \(S_H\) being coercive, i.e., there is a \(\delta > 0\) so that \(S_H(x) \ge \delta \Vert x \Vert ^2\) for all \(x \in X\).

Also, for the case where H is unbounded, the similarity \(\Gamma \) should be weakened to a pseudo-similarity defined as follows.

Definition 1.8

Two well-posed systems \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) and \(\Sigma ^\circ =\left[ {\begin{matrix}{\mathfrak {A}}^\circ &{}{\mathfrak {B}}^\circ \\ {\mathfrak {C}}^\circ &{}{\mathfrak {D}}^\circ \end{matrix}}\right] \), with state spaces X and \(X^\circ \), respectively, are pseudo-similar if \({\mathfrak {D}}^\circ ={\mathfrak {D}}\) and there exists a closed, densely defined and injective linear operator \(\Gamma :X\supset {\text {dom}}(\Gamma )\rightarrow X^\circ \) with dense range, called a pseudo-similarity, with the following properties:

  1. (1)

    \({\text {ran}}({\mathfrak {B}})\subset {\text {dom}}(\Gamma )\) and \({\mathfrak {B}}^\circ =\Gamma {\mathfrak {B}}\), or equivalently \({\text {ran}}({\mathfrak {B}}^t) \subset {\text {dom}}(\Gamma )\) and \({\mathfrak {B}}^{\circ t} = \Gamma {\mathfrak {B}}^t\) for each t.

  2. (2)

    for all \(t\geqslant 0\), \({\mathfrak {A}}^t{\text {dom}}(\Gamma )\subset {\text {dom}}(\Gamma )\) and \({\mathfrak {A}}^{\circ t}\Gamma =\Gamma {\mathfrak {A}}^t\big |_{{\text {dom}}(\Gamma )}\), and

  3. (3)

    \({\mathfrak {C}}^\circ \Gamma ={\mathfrak {C}}\big |_{{\text {dom}}(\Gamma )}\), or equivalently, \({\mathfrak {C}}^{\circ t} \Gamma = {\mathfrak {C}}^t \big |_{{\text {dom}}(\Gamma )}\) for all \(t > 0\).

If \(\Gamma \) is bounded with a bounded inverse, then \(\Sigma \) and \(\Sigma ^\circ \) are said to be similar. (In this case the condition that \({\text {dom}}(\Gamma )=X\) is automatically satisfied.)

This definition is reproduced from [31, Definition 9.2.1], but with the condition that the range of \(\Gamma \) is dense added and a couple of redundant assumptions dropped; observe that Staffans also states on page 512 of [31] that \(\Gamma ^{-1}\) is a pseudo-similarity if \(\Gamma \) is a pseudo-similarity, that property (1) in Definition 1.8 implies that \({\mathfrak {B}}^\circ \) maps into \({\text {ran}}(\Gamma )\), and that item (2) implies that \({\text {ran}}(\Gamma )\) is invariant under \({\mathfrak {A}}^{\circ t}\). Hence the two pseudo-similarity definitions are equivalent.

We make the following additional definitions:

  • For each \(\alpha \in {{\mathbb {R}}}\), we define \({{\mathbb {C}}}_\alpha :=\left\{ z\in {{\mathbb {C}}}\mid {\text {Re}}\,z>\alpha \right\} \) (so in particular \({{{\mathbb {C}}}^{+}}={{\mathbb {C}}}_0\)).

  • We let \(H^\infty ({{\mathbb {C}}}_\alpha ;{{\mathcal {B}}}(U,Y))\) denote the \({{\mathcal {B}}}(U,Y)\)-valued functions which are analytic and bounded on \({{\mathbb {C}}}_\alpha \).

Thus the Schur class consists of those functions \(F\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\) such that \(F(\lambda )\) is a contraction from U into Y for all \(\lambda \in {{{\mathbb {C}}}^{+}}\), and in this case we write \(F\in {{\mathcal {S}}}_{U,Y}\). In fact, for convenience, we identify two analytic functions which coincide on some set in the intersection of their domains which has an interior cluster point. In particular, we write \(F\in {{\mathcal {S}}}_{U,Y}\) if the restriction \(F\big |_{{\text {dom}}(F)\bigcap {{{\mathbb {C}}}^{+}}}\) has a unique extension to a function in \({{\mathcal {S}}}_{U,Y}\).

In the infinite dimensional situation, following [31] we use the frequency response idea at the beginning of the introduction to define the transfer function \({\widehat{{\mathfrak {D}}}}\) by the formula

$$\begin{aligned} {\widehat{{\mathfrak {D}}}}(\lambda )u_0:= ({\overline{{\mathfrak {D}}}} e_\lambda u_0)(0), \quad \lambda \in {{\mathbb {C}}}_{\omega _{\mathfrak {A}}},\ u_0\in U, \end{aligned}$$

where \(e_\lambda \) denotes the exponential function \(e_\lambda (t)=e^{\lambda t}\), \(t\in {{\mathbb {R}}}\), \(\omega _{\mathfrak {A}}\) is the growth bound of the semigroup \({\mathfrak {A}}\), and \({\overline{{\mathfrak {D}}}}\) is a suitable version of the input/output map \({\mathfrak {D}}\); see Proposition 2.3 for the details. We can now formulate our first main result.

Theorem 1.9

(Standard infinite dimensional bounded real lemma) For a minimal well-posed system \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) with transfer function \({\widehat{{\mathfrak {D}}}}\) the following are equivalent:

  1. (1)

    The transfer function satisfies \({\widehat{{\mathfrak {D}}}}\in {{\mathcal {S}}}_{U,Y}\) (in the generalized sense described above).

  2. (2)

    The continuous-time KYP-inequality has a ‘spatial’ solution H in the following sense: H is a closed, possibly unbounded, densely defined, and positive definite operator on X, such that for all \(t>0\):

    $$\begin{aligned} \begin{aligned} {\mathfrak {A}}^t\,{\text {dom}}(H^{\frac{1}{2}}) \subset {\text {dom}}(H^{\frac{1}{2}}),\quad {\mathfrak {B}}^t\,L^2([0,t];U) \subset {\text {dom}}(H^{\frac{1}{2}}), \end{aligned} \end{aligned}$$
    (1.12)

    and the following spatial form of the KYP-inequality holds:

    $$\begin{aligned} \left\| \begin{bmatrix}H^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix}\begin{bmatrix}x\\ {{\mathbf {u}}}\end{bmatrix}\right\| \leqslant \left\| \begin{bmatrix}H^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix}\begin{bmatrix}x\\ {{\mathbf {u}}}\end{bmatrix}\right\| ,\quad \begin{bmatrix}x\\ {{\mathbf {u}}}\end{bmatrix}\in \begin{bmatrix}{\text {dom}}(H^{\frac{1}{2}})\\ L^2([0,t];U)\end{bmatrix}, \end{aligned}$$
    (1.13)

    where the norms are those of \(\left[ {\begin{matrix}X\\ L^2([0,t];Y) \end{matrix}}\right] \) and \(\left[ {\begin{matrix}X\\ L^2([0,t];U) \end{matrix}}\right] \), respectively.

  3. (3)

    The system \(\Sigma \) is pseudo-similar to a passive system.

  4. (4)

    The system \(\Sigma \) has a storage function.

  5. (5)

    The system \(\Sigma \) has a quadratic storage function.

When these equivalent conditions hold, an operator H defining a quadratic storage function in item (5) will also be a spatial solution of the KYP-inequality in item (2) and vice versa. For every pseudo-similarity \(\Gamma \) to a passive system, the operator \(H:=\Gamma ^*\Gamma \) is a spatial solution to the KYP-inequality in item (2) and it can serve as the operator defining the quadratic storage function in item (5).

Note that the spatial solution H of the KYP-inequality in item (2) of the preceding theorem is required to be independent of t.

In §3 below (see in particular Definition 3.7), we will introduce the concepts of \(L^2\)-exact controllability and \(L^2\)-exact observability for continuous-time systems, which are weaker than exact controllability and exact observability in infinite time, but still strong enough to guarantee a bounded solution of the KYP-inequality. Thus we get the following alternative infinite dimensional version of the standard bounded real lemma, a result which we believe is new in the continuous-time setting:

Theorem 1.10

(\(L^2\)-minimal infinite dimensional bounded real lemma) For an \(L^2\)-minimal well-posed system \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) with transfer function \({\widehat{{\mathfrak {D}}}}\), the following conditions are equivalent:

  1. (1)

    The transfer function of \(\Sigma \) satisfies \({\widehat{{\mathfrak {D}}}}\in {{\mathcal {S}}}_{U,Y}\).

  2. (2)

    A bounded, strictly positive definite solution H to the following standard KYP-inequality exists:

    $$\begin{aligned} \begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix}^*\begin{bmatrix}H&{}0\\ 0&{}I\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix} \preceq \begin{bmatrix}H&{}0\\ 0&{}I\end{bmatrix},\quad t\geqslant 0, \end{aligned}$$
    (1.14)

    with the adjoint computed w.r.t. the inner product in \(L^2([0,t];K)\), where \(K=U\) or \(K=Y\).

  3. (3)

    The system \(\Sigma \) is similar to a passive system.

When these conditions hold, in fact \({{{\mathbb {C}}}^{+}}\subset {\text {dom}}({\widehat{{\mathfrak {D}}}})\), so that \({\widehat{{\mathfrak {D}}}}|_{{{{\mathbb {C}}}^{+}}}\) is itself in \({{\mathcal {S}}}_{U,Y}\), rather than just having a unique restriction-followed-by-extension in \({{\mathcal {S}}}_{U,Y}\).

For each bounded, strictly positive definite solution H to the KYP-inequality in item (2), the operator \(\Gamma :=H^{\frac{1}{2}}\) establishes similarity to a passive system as in item (3). Conversely, for every similarity \(\Gamma \) in item (3), \(H:=\Gamma ^*\Gamma \) is a bounded, strictly positive definite solution to the KYP-inequality in item (2).
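For a finite dimensional toy model, this correspondence between KYP-solutions and similarities can be traced explicitly. In the sketch below (Python with NumPy/SciPy; hypothetical data), a passive system is disguised by a similarity \(\Gamma _0\); then \(H=\Gamma _0^*\Gamma _0\) solves the KYP-inequality of the disguised system, and \(\Gamma := H^{\frac{1}{2}}\) transforms it back into a passive one:

```python
import numpy as np
from scipy.linalg import sqrtm

def kyp_lhs(A, B, C, D, H):                 # left-hand side of (1.6)
    m = B.shape[1]
    return np.block([
        [H @ A + A.T @ H + C.T @ C, H @ B + C.T @ D],
        [B.T @ H + D.T @ C,         D.T @ D - np.eye(m)],
    ])

# A passive system (H = I works) ...
A0 = np.array([[-2.0, 0.0], [0.0, -3.0]])
B0 = np.array([[1.0], [1.0]])
C0 = np.array([[0.5, 0.5]])
D0 = np.array([[0.0]])
# ... disguised by a similarity G0 (hypothetical choice):
G0 = np.array([[1.0, 2.0], [0.0, 1.0]])
G0i = np.linalg.inv(G0)
A, B, C, D = G0i @ A0 @ G0, G0i @ B0, C0 @ G0, D0

H = G0.T @ G0                               # solves (1.6) for (A, B, C, D)
assert np.linalg.eigvalsh(kyp_lhs(A, B, C, D, H)).max() <= 1e-10

G = sqrtm(H).real                           # Gamma := H^{1/2}
Gi = np.linalg.inv(G)
Ao, Bo, Co, Do = G @ A @ Gi, G @ B, C @ Gi, D
# The transformed system is passive: (1.6) holds with H = I.
print(np.linalg.eigvalsh(kyp_lhs(Ao, Bo, Co, Do, np.eye(2))).max() <= 1e-10)
```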

All solutions H to the spatial KYP-inequality in item (2) of Theorem 1.9 are in fact bounded, strictly positive definite solutions of (1.14), and there exist bounded, strictly positive definite solutions \(H_a\) and \(H_r\) of (1.14) such that

$$\begin{aligned} H_a \preceq H \preceq H_r. \end{aligned}$$

Remark 1.11

The \(L^2\)-minimality assumption in Theorem 1.10 brings the results much closer to the finite dimensional setting, while only assuming minimality makes the situation more subtle. For instance, while each pseudo-similarity provides a spatial solution to the KYP-inequality (1.13), the converse may not hold, as it does not appear to be the case that every spatial KYP-solution H can be used to define a passive well-posed system \(\Sigma '\) via (1.7); see the proof of Theorem 1.10 for more details in the bounded case. Specifically, to prove strong continuity of the semigroup of the candidate passive system, more conditions seem necessary. Also, assuming only minimality, there are results on a ‘largest’ and a ‘smallest’ solution to the spatial KYP-inequality, but these serve as extremal solutions only for subclasses of spatial KYP-solutions; see Remark 7.5 below for more details.

It is straightforward to formulate a naive infinite dimensional version of the strict BRL. While the implications (2) \(\Leftrightarrow \) (3) and (2) \(\Rightarrow \) (1) are then straightforward, the implication (1) \(\Rightarrow \) (2) or (3) appears to require some extra hypotheses. We present three possible strengthenings of the hypothesis (1) so that the implication (1) \(\Rightarrow \) (2) or (3) holds in the infinite dimensional setting. The naive expectation is that the stability assumption on A from the finite dimensional case should be strengthened, in the continuous-time case, to the assumption that the \(C_0\)-semigroup be exponentially stable. However, this appears to be not sufficient in general. We shall additionally assume that the \(C_0\)-semigroup \(\{{\mathfrak {A}}^t \mid t \ge 0\}\) embeds into a \(C_0\)-group \(\{ {{\widetilde{{\mathfrak {A}}}}}^t \mid t \in {{\mathbb {R}}}\}\) (meaning that \(\{ {{\widetilde{{\mathfrak {A}}}}}^t \mid t \in {{\mathbb {R}}}\}\) is a \(C_0\)-group of operators such that \({{\widetilde{{\mathfrak {A}}}}}^t = {\mathfrak {A}}^t\) for \(t \ge 0\)). Equivalently, the \(C_0\)-semigroup \(\{{\mathfrak {A}}^t \mid t \ge 0\}\) is such that \({\mathfrak {A}}^t\) is invertible for some \(t>0\); see Proposition 5.2 below for additional information. We note that this invertibility condition always holds in finite dimensions, and hence the notions of strict and semi-strict collapse to one notion of strictness in the finite dimensional case.

In addition we introduce auxiliary operators

$$\begin{aligned} {\mathfrak {C}}^t_{1_X, A} :X \rightarrow L^2([0,t], X), \quad {\mathfrak {D}}^t_{A,B} :L^2([0,t]; U) \rightarrow L^2([0,t]; X) \end{aligned}$$

given by

$$\begin{aligned}&{\mathfrak {C}}^t_{1_X, A} :x \mapsto ( s \mapsto 1_X {\mathfrak {A}}^s x = {\mathfrak {A}}^s x)_{0 \le s \le t} \in L^2([0,t], X), \\&{\mathfrak {D}}_{A,B}^t :( s \mapsto {{\mathbf {u}}}(s))_{0 \le s \le t} \mapsto \left( s \mapsto \int _0^s {\mathfrak {A}}^{s-r} B {{\mathbf {u}}}(r) \,{\mathrm {d}}r\right) _{0 \le s \le t}. \end{aligned}$$

Here \( \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \) is the system node associated with the well-posed system (details in §4 below) and we shall be assuming that the \(C_0\)-semigroup \({\mathfrak {A}}^t\) generated by A is exponentially stable. Under these conditions the system trajectories \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) associated with \(\Sigma \) are such that \({{\mathbf {x}}}\in L^2({{\mathbb {R}}}^+, X)\) and \({{\mathbf {y}}}\in L^2({{\mathbb {R}}}^+, Y)\) as long as \({{\mathbf {u}}}\in L^2({{\mathbb {R}}}^+, U)\). In system-trajectory terms, the operator \(\begin{bmatrix} {\mathfrak {C}}^t_{1_X, A}&{\mathfrak {D}}^t_{A,B} \end{bmatrix}\) has the following property: if \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) is any system trajectory, then

$$\begin{aligned} \begin{bmatrix} {\mathfrak {C}}^t_{1_X, A}&{\mathfrak {D}}^t_{A,B} \end{bmatrix} :\begin{bmatrix} {{\mathbf {x}}}(0) \\ {{\mathbf {u}}}|_{[0,t]} \end{bmatrix} \mapsto {{\mathbf {x}}}|_{[0,t]} \in L^2([0,t], X). \end{aligned}$$
(1.15)

Our version of the strict BRL for the infinite dimensional continuous-time setting is as follows:

Theorem 1.12

(Infinite dimensional strict bounded real lemma) Consider the following statements for a well-posed system \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \):

  1. (1)

    The transfer function \({\widehat{{\mathfrak {D}}}}\) of \(\Sigma \) is in \({{\mathcal {S}}}_{U,Y}^0\) and \({{{\mathbb {C}}}^{+}}\subset {\text {dom}}({\widehat{{\mathfrak {D}}}})\).

  2. (2a)

    There exists a bounded \(H \gg 0\) on X which satisfies the strict KYP-inequality associated with \(\Sigma \), i.e., there is a \(\delta >0\) such that

    $$\begin{aligned}&\begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix}^*\begin{bmatrix}H&{}0\\ 0&{}1_{L^2([0,t],Y)}\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix} \nonumber \\&\quad + \delta \begin{bmatrix} ({\mathfrak {C}}_{1_X,A}^t)^* \\ ({\mathfrak {D}}_{A,B}^t)^* \end{bmatrix} \begin{bmatrix} {\mathfrak {C}}_{1_X,A}^t&{\mathfrak {D}}_{A,B}^t \end{bmatrix} \preceq \begin{bmatrix}H&{}0\\ 0&{} (1 - \delta ) 1_{L^2([0,t], U)}\end{bmatrix},\quad t>0. \end{aligned}$$
    (1.16)
  3. (2b)

    There exists a bounded \(H \gg 0\) on X which satisfies the semi-strict KYP-inequality for \(\Sigma \), i.e., there is a \(\delta > 0\) so that for all \(t> 0\) we have:

    $$\begin{aligned}&\begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix}^*\begin{bmatrix}H&{}0\\ 0&{}1_{L^2([0,t],Y)}\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix} \preceq \begin{bmatrix}H&{}0\\ 0&{} (1 - \delta ) 1_{L^2([0,t], U)}\end{bmatrix}. \end{aligned}$$
    (1.17)
  4. (3a)

    \(\Sigma \) is similar to a strictly passive system, i.e., one satisfying (1.16) with \(H=1_X\) and some \(\delta >0\).

  5. (3b)

    \(\Sigma \) is similar to a semi-strictly passive system, i.e., one satisfying (1.17) with \(H = 1_X\).

  6. (4a)

    \(\Sigma \) has a finite-valued, coercive, quadratic, strict storage function.

  7. (4b)

    \(\Sigma \) has a finite-valued, coercive, quadratic, semi-strict storage function.

  8. (5a)

    \(\Sigma \) has a strict storage function.

  9. (5b)

    \(\Sigma \) has a semi-strict storage function.

Then we have the following implications:

$$\begin{aligned} \begin{array}{ccccccccc} (2a)&{}\Longleftrightarrow &{}(3a)&{}\Longleftrightarrow &{}(4a)&{}\Longrightarrow &{}(5a)\\ \Downarrow &{}&{}\Downarrow &{}&{}\Downarrow &{}&{}\Downarrow \\ (2b)&{}\Longleftrightarrow &{}(3b)&{}\Longleftrightarrow &{}(4b)&{}\Longrightarrow &{}(5b)&{}\Longrightarrow &{}(1). \end{array} \end{aligned}$$

Furthermore, all nine statements (1)–(5b) in the list above are equivalent if we assume in addition that \({\mathfrak {A}}^t\) is exponentially stable and at least one of the following three conditions holds:

  1. (H1)

    \({\mathfrak {A}}^t\) can be embedded into a \(C_0\)-group;

  2. (H2)

    \(\Sigma \) is \(L^2\)-controllable;

  3. (H3)

    \(\Sigma \) is \(L^2\)-observable.

Remark 1.13

Let us sketch here the connection between the strict operator KYP-inequality (1.16) and the strict storage-function inequality (1.8).

As already observed in Remark 1.7, \(H \succeq 0\) being bounded corresponds to the associated quadratic storage function \(S_H(x) = \Vert H^{\frac{1}{2}}x \Vert ^2\) being finite-valued on X, and \(H \gg 0\) corresponds to \(S_H\) being coercive.

Given a well-posed system \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \), by the definition of the system trajectories, \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) is determined from the initial condition \({{\mathbf {x}}}(0) = x_0\) and the input signal \({{\mathbf {u}}}\) according to

$$\begin{aligned} {{\mathbf {x}}}(t)&= {\mathfrak {A}}^t x_0 + {\mathfrak {B}}^t {{\mathbf {u}}}|_{[0,t]}, \\ {{\mathbf {y}}}|_{[0,t]}&= {\mathfrak {C}}^t x_0 + {\mathfrak {D}}^t {{\mathbf {u}}}|_{[0,t]},\quad t \ge 0. \end{aligned}$$

If we look at the quadratic form coming from the selfadjoint operator on the left-hand side of the operator inequality (1.16) evaluated at \(\left[ {\begin{matrix}{{\mathbf {x}}}(0) \\ {{\mathbf {u}}}|_{[0,t]} \end{matrix}}\right] \) coming from a system trajectory \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\), we get

$$\begin{aligned}&\langle H ({\mathfrak {A}}^t x_0 + {\mathfrak {B}}^t {{\mathbf {u}}}|_{[0,t]}), {\mathfrak {A}}^t x_0 + {\mathfrak {B}}^t {{\mathbf {u}}}|_{[0,t]} \rangle _X + \Vert {\mathfrak {C}}^t x_0 + {\mathfrak {D}}^t {{\mathbf {u}}}|_{[0,t]} \Vert ^2_{L^2([0,t], Y)} \\&\quad + \delta \Vert {{\mathbf {x}}}|_{[0,t]} \Vert ^2_{L^2([0,t], X)} = \langle H {{\mathbf {x}}}(t), {{\mathbf {x}}}(t) \rangle _X + \Vert {{\mathbf {y}}}|_{[0,t]} \Vert ^2_{L^2([0,t],Y)} + \delta \Vert {{\mathbf {x}}}|_{[0,t]} \Vert ^2_{L^2([0,t],X)} \end{aligned}$$

while the right-hand side gives us

$$\begin{aligned} \langle H {{\mathbf {x}}}(0), {{\mathbf {x}}}(0) \rangle _X + (1 - \delta ) \Vert {{\mathbf {u}}}\Vert ^2_{L^2([0,t],U)}\,. \end{aligned}$$

Thus the strict KYP-inequality (1.16) for a bounded \(H \gg 0\), when viewed in terms of the respective quadratic forms evaluated at \(\left[ {\begin{matrix} {{\mathbf {x}}}(0) \\ {{\mathbf {u}}}|_{[0,t]} \end{matrix}}\right] \), becomes exactly

$$\begin{aligned}&\langle H {{\mathbf {x}}}(t), {{\mathbf {x}}}(t) \rangle _X + \Vert {{\mathbf {y}}}|_{[0,t]} \Vert ^2_{L^2([0,t],Y)} + \delta \Vert {{\mathbf {x}}}|_{[0,t]} \Vert ^2_{L^2([0,t],X)} \\&\quad \le \langle H {{\mathbf {x}}}(0), {{\mathbf {x}}}(0) \rangle _X + (1 - \delta ) \Vert {{\mathbf {u}}}|_{[0,t]} \Vert ^2_{L^2([0,t],U)}. \end{aligned}$$

Setting \(S_H(x) = \Vert H^{\frac{1}{2}} x \Vert ^2 = \langle H x , x \rangle \), we see that the last inequality is exactly the defining inequality (1.8) for \(S_H\) to be a strict storage function. Thus the class of bounded \(H \gg 0\) satisfying the strict KYP-inequality (1.16) is exactly the class of H for which the associated quadratic function \(S_H\) is a finite-valued, coercive strict storage function for \(\Sigma \).

A similar analysis gives the corresponding statement for the semi-strict setting: the class of bounded \(H \gg 0\) satisfying the semi-strict KYP-inequality (1.17) is exactly the class of H for which the associated quadratic function \(S_H\) is a finite-valued, coercive, semi-strict storage function.

Arov and Staffans [5] also treat the standard BRL for infinite dimensional, continuous-time systems (Theorem 1.9 above), but from a complementary point of view. There the authors introduce system nodes \( \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \) first, and then define the associated well-posed system \(\Sigma = \left[ {\begin{matrix}{\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) through smooth system trajectories associated with the system-node trajectories. They introduce the notion of pseudo-similarity at the level of system nodes and obtain the equivalence of pseudo-similarity to a dissipative system node with the existence of a solution to a spatial KYP-inequality expressed directly in terms of the system node operators (an infinite dimensional spatial analogue of the KYP-inequality (1.6)). To complete the analysis they use a Cayley-transform computation to reduce the result to the discrete-time situation studied in [2] (see Remark 4.5 below for additional details). In the present paper, on the other hand, all details are worked out directly in the continuous-time systems setting rather than by using Cayley transforms to map into discrete time. This is necessary in our study of the strict BRL, because exponential stability in continuous time is in general not mapped into exponential stability in discrete time; see Example 5.5 below.

We extend the concept of \(L^2\)-storage function, originally introduced by Willems [33, 34] and developed further for discrete-time infinite dimensional systems in [10], to continuous-time, infinite dimensional systems. We show that Willems’ available storage function \(S_a\) (see [33, 34]) is of a special type which we call \(L^2\)-regular, whereas Willems’ required supply \(S_r\) is not. In response to the latter, we introduce an \(L^2\)-regularized version \({\underline{S}}_r\leqslant S_r\) of the required supply and prove that all \(L^2\)-regularized storage functions S satisfy \(S_a\leqslant S\leqslant {\underline{S}}_r\) under some additional assumptions. Moreover, we prove that \(S_a\) and \({\underline{S}}_r\) are quadratic. Our variational approach in §6 to explicitly determining the operators that define \(S_a\) and \({\underline{S}}_r\) is much in the same spirit as the discussion in [23, §3].

Extensions to the infinite dimensional, Hilbert space setting were begun already by Yakubovich in [37, 38], but the theory has been systematized and refined in many iterations after these seminal papers. The paper of Curtain [12] for instance treats the strict BRL for the case where “B and C are bounded” (i.e., \(B \in {{\mathcal {B}}}(U,X)\) and \(C \in {{\mathcal {B}}}(X,Y)\)) and the resulting feedthrough operator \(D \in {{\mathcal {B}}}(U,Y)\) is taken to be 0. Her KYP-inequality can be seen (via a Schur-complement calculation) to be contained in our strict KYP-inequality criterion (see (4.8) below) when specialized to her situation.

In addition to the BRL as presented here, the so-called KYP lemma appears in the context of many other topics in control theory, e.g., the design of a certain type of Lyapunov function leading to stabilization of a linear system via a nonlinear state-feedback control as in the original problem of Lur’e, linear-quadratic optimization problems, feedback design, etc.; we refer to [17] for an informative survey. The paper [18] for instance gives a far-reaching extension of the original form of the KYP-lemma, allowing the FDIs to be given only on finite frequency intervals and the class of systems allowed to be more general, by exploiting the S-procedure, which also goes back to work of Yakubovich (see [14, 36]).

The Bounded Real Lemma (more generally the KYP lemma) has now been adapted to a number of additional applications. Let us mention that, specifically, in [16], the bounded real lemma is applied to model reduction, more precisely to balanced bounded real truncation, and the relation of the minimal and maximal storage functions to optimal control theory is described; see also [30] for this connection and an alternative version of the strict bounded real and positive real lemmas. Finally, we mention that there is also an extension [11] of the present approach to discrete-time dichotomous and bicausal systems, where it is essential that solutions of the KYP-inequality be indefinite; such a situation is considered for both discrete-time and continuous-time systems in [26] to handle applications where a stabilizability assumption is missing. It should be of interest to extend the results here to the dichotomous setting, thereby getting a continuous-time analogue of [11].

The paper is organized as follows. In §2, the basics of well-posed systems are recalled. In §4 the complementary differential approach via system nodes is reviewed, because some issues coming up in the sequel are more easily resolved via the system-node approach. In §3 we develop the concept of \(L^2\)-minimality for the continuous-time setting (analogous to developments in [10] for the discrete-time setting). Some examples of \(L^2\)-minimal systems are discussed in §5. In §6, we extend the concept of \(L^2\)-regularized storage function from [10] to continuous time and we use this to study \(S_a\) and \({\underline{S}}_r\). Finally, in §8 we prove our main results stated in the present introduction. Parts of the proofs are based on an operator optimization problem, which is the topic of Appendix A.

Notation and terminology. For \(t \in {{\mathbb {R}}}\), we define the backward shift operator \(\tau ^t\) acting on a function \({{\mathbf {u}}}\) with \({\text {dom}}({{\mathbf {u}}})\subset {{\mathbb {R}}}\) by

$$\begin{aligned} (\tau ^t {{\mathbf {u}}})(s)={{\mathbf {u}}}(t+s),\qquad s\in {{\mathbb {R}}},\, t+s\in {\text {dom}}({{\mathbf {u}}}). \end{aligned}$$

Given \(J\subset {{\mathbb {R}}}\), we define the projection \(\pi _J\) acting on a function \({{\mathbf {u}}}\) with \(J\subset {\text {dom}}({{\mathbf {u}}})\subset {{\mathbb {R}}}\) by

$$\begin{aligned} (\pi _J{{\mathbf {u}}})(s):={\left\{ \begin{array}{ll} {{\mathbf {u}}}(s),\quad s\in J, \\ 0,\quad s\in {{\mathbb {R}}}\setminus J.\end{array}\right. } \end{aligned}$$

Set \({{\mathbb {R}}}^+:=[0,\infty )\) and \({{\mathbb {R}}}^-:=(-\infty ,0)\). We abbreviate \(\pi _+:=\pi _{{{\mathbb {R}}}^+}\), \(\pi _-:=\pi _{{{\mathbb {R}}}^-}\) and define \(\tau _+^t:=\pi _+\tau ^t\) and \(\tau _-^t:=\tau ^t\pi _-\) for \(t\geqslant 0\), both acting on functions with support anywhere in \({{\mathbb {R}}}\). The multiplicative interaction between these operations is given by

$$\begin{aligned} \tau ^t \pi _{J+t} =\pi _{J} \tau ^t,\quad t\in {{\mathbb {R}}},\, J\subset {{\mathbb {R}}},\quad \text{ with } J+t:=\{x+t \mid x\in J\}. \end{aligned}$$
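These operations and the commutation rule are straightforward to model with signals as callables; a minimal sketch (Python with NumPy; the test signal is hypothetical, and J is taken as a half-open interval):

```python
import numpy as np

# Signals as callables on R; tau^t is the backward shift, pi_J a cut-off.
tau = lambda t, u: (lambda s: u(t + s))
pi = lambda a, b, u: (lambda s: u(s) if a <= s < b else 0.0)   # J = [a, b)

u = lambda s: float(np.exp(-abs(s)))     # hypothetical test signal
t, (a, b) = 1.5, (0.0, 2.0)

# tau^t pi_{J+t} = pi_J tau^t, with J + t = [a + t, b + t):
lhs = tau(t, pi(a + t, b + t, u))
rhs = pi(a, b, tau(t, u))
print(all(np.isclose(lhs(s), rhs(s)) for s in np.linspace(-4.0, 4.0, 81)))
```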

Furthermore, we let \({{\mathcal {R}}}\) denote the reflection operator:

$$\begin{aligned} ({{\mathcal {R}}}{{\mathbf {u}}})(s):={{\mathbf {u}}}(-s),\qquad s\in {{\mathbb {R}}},\, -s\in {\text {dom}}({{\mathbf {u}}}). \end{aligned}$$
(1.18)

Let K be a Hilbert space. For every, not necessarily bounded, interval \(J\subset {{\mathbb {R}}}\) we write \(L^2(J;K)\) for the usual Hilbert space of K-valued measurable, square integrable functions on J, considering this space as a subspace of \(L^2_K:=L^2({{\mathbb {R}}};K)\) by zero extension, without writing out the injection explicitly. We abbreviate \(L^{2+}_K:=L^2({{\mathbb {R}}}^+;K)\) and \(L^{2-}_K:=L^2({{\mathbb {R}}}^-;K)\). With \(L^2_{loc,K}\) we denote the space of K-valued measurable functions \({{\mathbf {u}}}\) such that \(\pi _J {{\mathbf {u}}}\in L^2_K\) for every bounded interval J. The symbols \(L^2_{\ell ,K}\), \(L^2_{r,K}\) and \(L^2_{\ell ,r,K}\) stand for the spaces of functions \({{\mathbf {u}}}\in L^2_K\) with support bounded to the left (\({\text {supp}}({{\mathbf {u}}})\subset (L,\infty )\) for some \(L\in {{\mathbb {R}}}\)), support bounded to the right (\({\text {supp}}({{\mathbf {u}}})\subset (-\infty ,L)\) for some \(L\in {{\mathbb {R}}}\)), or with support bounded on both sides, respectively. Similarly we define \(L^2_{\ell ,loc,K}\), \(L^2_{r,loc,K}\), \(L^{2\pm }_{loc,K}\), \(L^{2\pm }_{\ell ,K}\), etc. However, note that some spaces may coincide, e.g., \(L^2_{\ell ,r,loc,K}=L^2_{\ell ,r,K}\), \(L^{2+}_{\ell ,loc,K}=L^{2+}_{loc,K}\), \(L^{2+}_{r,loc,K}=L^{2+}_{r,K}\), etc. Convergence of \({{\mathbf {z}}}_k\) to \({{\mathbf {z}}}\) in \(L^2_{\ell ,loc,K}\) means that there is some \(L\in {{\mathbb {R}}}\) such that \({\text {supp}}({{\mathbf {z}}}), {\text {supp}}({{\mathbf {z}}}_k)\subset (L,\infty )\) for all k, and \(\pi _{[L,T]}{{\mathbf {z}}}_k\rightarrow \pi _{[L,T]}{{\mathbf {z}}}\) in \(L^2_K\) for all \(T>L\); convergence in \(L^2_{r,loc,K}\) is defined similarly. Moreover, \(L^{2-}_{\ell ,K}=L^{2-}_{\ell ,loc,K}\) and \(L^{2+}_{loc,K}=L^{2+}_{\ell ,loc,K}\) are considered as subspaces of \(L^{2}_{\ell ,loc,K}\) with support contained in \({\overline{{{{\mathbb {R}}}^{-}}}}\) and \({{{\mathbb {R}}}^{+}}\), respectively, and we let these spaces inherit the topology of \(L^{2}_{\ell ,loc,K}\). For an interval \(J\subset {{\mathbb {R}}}\), we write C(J; K) for the space of continuous functions on J with values in K.

Throughout, for Hilbert spaces U and V we write \({{\mathcal {B}}}(U,V)\) for the Banach space of bounded linear operators mapping U into V with the operator norm simply denoted by \(\Vert \ \Vert \). For a contraction operator T in \({{\mathcal {B}}}(U,V)\), that is, with \(\Vert T\Vert \leqslant 1\), we write \(D_T\) for the defect operator of T which is defined to be the unique positive semidefinite square root of the bounded, positive semidefinite operator \(I-T^*T\), i.e., \(D_T := ( I - T^* T)^{\frac{1}{2}}\).
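As a quick numerical illustration of the defect operator (Python with NumPy/SciPy; the matrices are hypothetical):

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 2))
T = M / (np.linalg.norm(M, 2) + 0.5)            # a contraction (||T|| < 1)
DT = sqrtm(np.eye(2) - T.T @ T).real            # defect operator D_T
print(np.allclose(DT @ DT, np.eye(2) - T.T @ T))   # D_T^2 = I - T*T
print(np.linalg.eigvalsh(DT).min() >= -1e-12)      # D_T is positive semidefinite
```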

2 Well-posed Linear Systems

In this section we provide some background on well-posed systems, more specifically, causal, time-invariant \(L^2\)-well-posed linear systems. We recall this class of systems in Definition 2.1; for a more detailed study and motivation of this class of systems we refer the reader to [31]. It may be helpful for the reader to verify that the system determined by (1.1) and (1.11) fits Definitions 2.1 and 2.2 below.

Definition 2.1

Let U, X and Y be separable Hilbert spaces. A quadruple \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) is called a well-posed system if it has the following properties:

  1. (1)

    The symbol \({\mathfrak {A}}\) indicates a family \(t\mapsto {\mathfrak {A}}^t\), which is a \(C_0\)-semigroup on X.

  2. (2)

    The input map \({\mathfrak {B}}:L^{2-}_{\ell ,U}\rightarrow X\) is a linear map satisfying \({\mathfrak {A}}^t{\mathfrak {B}}={\mathfrak {B}}\tau _-^t\) on \(L^{2-}_{\ell ,U}\), for all \(t\geqslant 0\).

  3. (3)

    The output map \({\mathfrak {C}}:X\rightarrow L^{2+}_{loc,Y}\) is a linear map satisfying \({\mathfrak {C}}{\mathfrak {A}}^t=\tau _+^t{\mathfrak {C}}\) on X, for all \(t\geqslant 0\).

  4. (4)

    The transfer map (input/output map) \({\mathfrak {D}}:L^2_{\ell ,loc,U}\rightarrow L^2_{\ell ,loc,Y}\) is a linear map satisfying the following identities on \(L^2_{\ell ,loc,U}\):

    1. (a)

      \(\tau ^t{\mathfrak {D}}={\mathfrak {D}}\tau ^t\) for all \(t\in {{\mathbb {R}}}\) (time invariance),

    2. (b)

      \(\pi _-{\mathfrak {D}}\pi _+=0\) (causality) and

    3. (c)

      \(\pi _+{\mathfrak {D}}\pi _-={\mathfrak {C}}{\mathfrak {B}}\pi _-\) (Hankel operator factorization).

  5. (5)

    The operators \({\mathfrak {B}}\), \({\mathfrak {C}}\), and \({\mathfrak {D}}\) are continuous with respect to the topology of \(L^2_{\ell ,loc}\).

We remark that the intertwinement in condition (2) in the preceding definition, \({\mathfrak {A}}^t{\mathfrak {B}}{{\mathbf {u}}}={\mathfrak {B}}\tau _-^t {{\mathbf {u}}}\) for \({{\mathbf {u}}}\in L^{2-}_{\ell ,U}\), is written in this form in [31,  Definition 2.2.1], but in fact the projection in \(\tau _-^t=\tau ^t\pi _-\) is redundant for such \({{\mathbf {u}}}\), since \(\pi _-{{\mathbf {u}}}={{\mathbf {u}}}\). It is also possible to consider \({\mathfrak {B}}\) as an operator with domain \(L^2_{\ell ,loc,U}\), without breaking this intertwinement property, by setting \({\mathfrak {B}}:={\mathfrak {B}}\pi _-\); however, we do not make this convention here. On the other hand, \({\mathfrak {C}}\) can be interpreted as an operator from X into \(L^2_{\ell ,loc,Y}\), since \(L^{2+}_{loc,Y}\) can be identified with a subspace of \(L^2_{\ell ,loc,Y}\) by zero extension on \({{{\mathbb {R}}}^{-}}\).

Given the well-posed system \(\Sigma \), we define

$$\begin{aligned} \begin{aligned} {\mathfrak {B}}^t&:={\mathfrak {B}}\pi _-\tau ^t|_{L^2([0,t],U)}:L^2([0,t],U)\rightarrow X,\quad t\in {{\mathbb {R}}}^+,\\ {\mathfrak {C}}^t&:=\pi _{[0,t]}{\mathfrak {C}}:X\rightarrow L^2([0,t],Y),\quad t\in {{\mathbb {R}}}^+,\qquad \text {and}\\ {\mathfrak {D}}^t&:=\pi _{[0,t]}{\mathfrak {D}}|_{L^2([0,t],U)}:L^2([0,t],U)\rightarrow L^2([0,t],Y),\quad t\in {{\mathbb {R}}}^+. \end{aligned} \end{aligned}$$
(2.1)

In order to stay compatible with the notation in [31], we abbreviate \({\mathfrak {B}}^t\pi _{[0,t]}{{\mathbf {u}}}\) to \({\mathfrak {B}}^t {{\mathbf {u}}}\), so that we can apply \({\mathfrak {B}}^t\) to arbitrary \({{\mathbf {u}}}\in L^2_{loc,U}\) rather than only to \({{\mathbf {u}}}\in L^2([0,t];U)\). Note that in (2.1) we deviate from the notation in [31, Definition 2.2.6]: what we define as \({\mathfrak {B}}^t\), \({\mathfrak {C}}^t\) and \({\mathfrak {D}}^t\) corresponds to \({\mathfrak {B}}^t_0\), \({\mathfrak {C}}^t_0\) and \({\mathfrak {D}}^t_0\) in [31], with the additional feature that we restrict \({\mathfrak {B}}^t\) and \({\mathfrak {D}}^t\) to functions in \(L^2([0,t],U)\).

With a slight modification of the formulas in [31,  Theorem 2.2.14] it is possible to recover \({\mathfrak {B}}\), \({\mathfrak {C}}\) and \({\mathfrak {D}}\) from \({\mathfrak {B}}^t\), \({\mathfrak {C}}^t\) and \({\mathfrak {D}}^t\) via:

$$\begin{aligned} \begin{aligned} {\mathfrak {B}}{{\mathbf {u}}}&= \lim _{t\rightarrow \infty } {\mathfrak {B}}^t \tau ^{-t} \pi _{[-t,0]}{{\mathbf {u}}}, \ \ {{\mathbf {u}}}\in L^{2-}_{\ell ,U},\quad {\mathfrak {C}}x=\lim _{t\rightarrow \infty } {\mathfrak {C}}^t x,\ \ x\in X,\\ {\mathfrak {D}}{{\mathbf {u}}}&= \lim _{t\rightarrow \infty } \tau ^{t} {\mathfrak {D}}^{2t} \tau ^{-t} \pi _{[-t,t]} {{\mathbf {u}}}, \quad {{\mathbf {u}}}\in L^{2}_{\ell ,loc,U}. \end{aligned} \end{aligned}$$
(2.2)

The limits for \({\mathfrak {B}}\) and \({\mathfrak {C}}\) follow from Theorem 2.2.14 in [31]. For \({\mathfrak {D}}\), a slightly different argument is needed, which we will now give. Fix \({{\mathbf {u}}}\in L^2_{\ell ,loc,U}\) and let L be such that \({\text {supp}}({{\mathbf {u}}})\subset [L,\infty )\). For all \(t>|L|\), we then get from the time invariance and causality of \({\mathfrak {D}}\) that

$$\begin{aligned} \tau ^{t} {\mathfrak {D}}^{ 2t} \tau ^{-t} \pi _{[-t,t]} {{\mathbf {u}}}= \tau ^{t} \pi _{[0,2t]} {\mathfrak {D}}\tau ^{-t} \pi _{[L,t]} {{\mathbf {u}}}= \pi _{[-t,t]} \tau ^{-t} \tau ^{t} {\mathfrak {D}}\pi _{[L,t]} {{\mathbf {u}}}= \pi _{[L,t]} {\mathfrak {D}}\pi _{[L,t]} {{\mathbf {u}}}. \end{aligned}$$

Now fix \(T>L\) arbitrarily. For all \(t>T\) we have \(\pi _{[L,T]}\pi _{[L,t]}{{\mathbf {u}}}=\pi _{[L,T]}{{\mathbf {u}}}\), so that \(\pi _{[L,t]}{{\mathbf {u}}}\rightarrow {{\mathbf {u}}}\) in \(L^2_{\ell ,loc,U}\) as \(t\rightarrow \infty \). By the continuity of \({\mathfrak {D}}\), we then get for \(t>\max \{|L|,|T|\}\) that

$$\begin{aligned} \pi _{[L,T]}\tau ^{t} {\mathfrak {D}}^{2t} \tau ^{-t} \pi _{[-t,t]} {{\mathbf {u}}}= \pi _{[L,T]}{\mathfrak {D}}\pi _{[L,t]} {{\mathbf {u}}}\rightarrow \pi _{[L,T]}{\mathfrak {D}}{{\mathbf {u}}}. \end{aligned}$$

Hence, in \(L^2_{\ell ,loc,Y}\), we have

$$\begin{aligned} \lim _{t\rightarrow \infty } \tau ^{t} {\mathfrak {D}}^{2t} \tau ^{-t} \pi _{[-t,t]} {{\mathbf {u}}}= {\mathfrak {D}}{{\mathbf {u}}}. \end{aligned}$$

Next we define what we mean by a solution, or a trajectory, of a well-posed system.

Definition 2.2

By an (input/state/output) trajectory on \({{{\mathbb {R}}}^{+}}\) of a well-posed linear system \(\Sigma \) with initial state \(x_0\in X\), we mean a triple \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) with input signal \({{\mathbf {u}}}\in L^{2+}_{loc,U}\), state signal \({{\mathbf {x}}}\in C({{{\mathbb {R}}}^{+}};X)\) and output signal \({{\mathbf {y}}}\in L^{2+}_{loc,Y}\) that satisfies

$$\begin{aligned} \begin{aligned} {{\mathbf {x}}}(t)&= {\mathfrak {A}}^tx_0+{\mathfrak {B}}^t \pi _{[0,t]} {{\mathbf {u}}},\quad t\geqslant 0, \\ {{\mathbf {y}}}&={\mathfrak {C}}x_0+{\mathfrak {D}}\pi _+ {{\mathbf {u}}}={\mathfrak {C}}x_0+{\mathfrak {D}}{{\mathbf {u}}}. \end{aligned} \end{aligned}$$
(2.3)

By an (input/state/output) trajectory of \(\Sigma \) on \({{\mathbb {R}}}\) (with initial state \(x_{-\infty }=0\)) we mean a triple \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) with input signal \({{\mathbf {u}}}\in L^2_{\ell ,loc,U}\), state trajectory \({{\mathbf {x}}}\in C({{\mathbb {R}}};X)\) and output signal \({{\mathbf {y}}}\in L^2_{\ell ,loc,Y}\) that satisfies

$$\begin{aligned} {{\mathbf {x}}}(t):={\mathfrak {B}}\pi _- \tau ^t {{\mathbf {u}}},\quad t\in {{\mathbb {R}}},\qquad {{\mathbf {y}}}:={\mathfrak {D}}{{\mathbf {u}}}. \end{aligned}$$
(2.4)

Note that a trajectory \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) on \({{{\mathbb {R}}}^{+}}\) is uniquely determined by the initial state \(x_0\) and the input signal \({{\mathbf {u}}}\), while a trajectory on \({{\mathbb {R}}}\) is uniquely determined by \({{\mathbf {u}}}\) alone; intuitively, one may think of \(\lim _{t\rightarrow -\infty } {{\mathbf {x}}}(t)=0\) as a kind of initial state at \(-\infty \). We mention a few rules on how trajectories on \({{\mathbb {R}}}\) and \({{\mathbb {R}}}^+\) can be manipulated, which will be useful in the sequel. The proofs are straightforward and left to the reader.

  1. (1)

    If \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is a trajectory on \({{\mathbb {R}}}\) and \(t\in {{\mathbb {R}}}\) with \({{\mathbf {x}}}(t)=0\), then \(\pi _{[t,\infty )}({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is also a trajectory on \({{\mathbb {R}}}\).

  2. (2)

    A triple \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is a trajectory on \({{\mathbb {R}}}\) if and only if the support of \({{\mathbf {u}}}\) is bounded to the left by some \(t\in {{\mathbb {R}}}\) and \(\tau ^t( {{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) is a trajectory on \({{{\mathbb {R}}}^{+}}\) with initial state zero.

  3. (3)

    The triple \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is a trajectory on \({{\mathbb {R}}}\) if and only if \(\tau ^s({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is a trajectory on \({{\mathbb {R}}}\) for some/all \(s\in {{\mathbb {R}}}\).

  4. (4)

    If \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) and \(({{\mathbf {v}}},{{\mathbf {z}}},{{\mathbf {w}}})\) are trajectories on \({{{\mathbb {R}}}^{+}}\) and \({{\mathbf {x}}}(t)={{\mathbf {z}}}(0)\) for some \(t>0\) then \(\pi _{[0,t)}({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})+ \tau ^{-t}({{\mathbf {v}}},{{\mathbf {z}}},{{\mathbf {w}}})\) is a trajectory on \({{{\mathbb {R}}}^{+}}\).

  5. (5)

    If \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is a trajectory on \({{\mathbb {R}}}\) and \(({{\mathbf {v}}},{{\mathbf {z}}},{{\mathbf {w}}})\) is a trajectory on \({{{\mathbb {R}}}^{+}}\) with \({{\mathbf {z}}}(0)={{\mathbf {x}}}(0)\) then \(\pi _-({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})+({{\mathbf {v}}},{{\mathbf {z}}},{{\mathbf {w}}})\) is a trajectory on \({{\mathbb {R}}}\).

In order to discuss additional features of the well-posed system \(\Sigma \), we need an alternative representation of \({\mathfrak {B}}\), \({\mathfrak {C}}\) and \({\mathfrak {D}}\), as bounded linear Hilbert space operators, and we now proceed to construct this representation. First set \(e_\lambda (t):=e^{\lambda t}\) for \(\lambda \in {{\mathbb {C}}},\) \(t\in {{\mathbb {R}}}\), and define the Hilbert space \(L^2_{\omega ,K}\) by

$$\begin{aligned} L^2_{\omega ,K}=\left\{ e_{\omega } {{\mathbf {u}}}\mid {{\mathbf {u}}}\in L^2_K \right\} \text { with } \left\langle e_{\omega } {{\mathbf {u}}} , e_{\omega } {{\mathbf {v}}} \right\rangle _{L^2_{\omega ,K}}:=\left\langle {{\mathbf {u}}} , {{\mathbf {v}}} \right\rangle _{L^2_K} \text { for } {{\mathbf {u}}},{{\mathbf {v}}}\in L^2_K. \end{aligned}$$

Similarly we define \(L^{2\pm }_{\omega ,K}\) by replacing \(L^2_K\) by \(L^{2\pm }_K\). Note that, as sets, we have the inclusions \(L^2_{\ell ,r,K}\subset L^2_{\omega ,K}\subset L^2_{loc,K}\) for all \(\omega \in {{\mathbb {R}}}\), with each space densely included in the next with respect to the topology of the larger space, and with similar dense inclusions for the corresponding \(L^{2\pm }\)–spaces.

It is well-known, see e.g., Theorem 2.5.4 in [31], that every \(C_0\)-semigroup \({\mathfrak {A}}\) has a growth bound

$$\begin{aligned} \omega _{\mathfrak {A}}:=\lim _{t\rightarrow \infty } \frac{\ln \Vert {\mathfrak {A}}^t\Vert }{t}<\infty , \end{aligned}$$
(2.5)

meaning that for every \(\omega >\omega _{\mathfrak {A}}\) there is some \(M>0\) such that \(\Vert {\mathfrak {A}}^t\Vert \leqslant Me^{\omega t}\) for all \(t\geqslant 0\). We call \(\Sigma \), or \({\mathfrak {A}}\), exponentially stable if \(\omega _{\mathfrak {A}}<0\). In this connection, we also point out that a passive system has a contractive semigroup, i.e., \(\Vert {\mathfrak {A}}^t\Vert \leqslant 1\) for all \(t\geqslant 0\), and this implies that \(\omega _{\mathfrak {A}}\leqslant 0\). In particular, all \(\alpha \in {{\mathbb {C}}}_{\omega _{\mathfrak {A}}}\) lie in the resolvent set \(\rho (A)\) of the generator A of \({\mathfrak {A}}\), meaning that \(\alpha -A\) has a bounded inverse on the state space X; see [31,  Theorem 3.2.9(i)].
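As a simple orientation point (our remark, not taken from [31]): for a finite-dimensional system as in (1.1), with \({\mathfrak {A}}^t=e^{At}\) on \(X={{\mathbb {C}}}^n\), the growth bound in (2.5) is the spectral abscissa,

$$\begin{aligned} \omega _{\mathfrak {A}}=\max \left\{ {\text {Re}}\,\lambda \mid \lambda \in \sigma (A) \right\} , \end{aligned}$$

as one sees from the Jordan decomposition, which gives \(\Vert e^{At}\Vert \leqslant p(t)\,e^{\omega _{\mathfrak {A}}t}\), \(t\geqslant 0\), for some polynomial p. In infinite dimensions only the inequality \(\omega _{\mathfrak {A}}\geqslant \sup \{{\text {Re}}\,\lambda \mid \lambda \in \sigma (A)\}\) survives in general, which is consistent with the resolvent statement above.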

Fix a real number \(\omega \). In case \(\omega >0\), then \(L^{2-}_{\omega ,K}\subset L^{2-}_{K}\) with dense and continuous embedding, and \(L^{2-}_{-\omega ,K}\) is the dual of \(L^{2-}_{\omega ,K}\) with pivot space \(L^{2-}_{K}\), so that the duality pairing satisfies

$$\begin{aligned} \left\langle {{\mathbf {v}}} , {{\mathbf {u}}} \right\rangle _{L^{2-}_{-\omega ,K},L^{2-}_{\omega ,K}}= \left\langle {{\mathbf {v}}} , {{\mathbf {u}}} \right\rangle _{L^{2-}_{K}},\quad {{\mathbf {v}}}\in L^{2-}_{K},\, {{\mathbf {u}}}\in L^{2-}_{\omega ,K}. \end{aligned}$$
(2.6)

See for instance [31,  §3.6] or [32,  §2.9] for detailed constructions of the dual with respect to a pivot space. For an exponentially stable system it is possible to take \(\omega =0\), and in that case the three spaces in (2.6) coincide. In fact, for an exponentially stable system it is even possible to take \(\omega <0\), in which case instead \(L^{2-}_{-\omega ,K}\) is the densely and continuously embedded subspace and \(L^{2-}_{\omega ,K}\) is the dual subspace of \(L^{2-}_{K}\). For \(L^{2+}_K\) the embeddings are reversed: \(L^{2+}_{-\omega ,K}\subset L^{2+}_{K}\subset L^{2+}_{\omega ,K}\) for \(\omega >0\), while \(L^{2+}_{\omega ,K}\subset L^{2+}_{K}\subset L^{2+}_{-\omega ,K}\) for \(\omega <0\), and duality pairings with respect to the pivot space \(L^{2+}_{K}\) exist in analogy to (2.6).
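To make the duality pairing (2.6) concrete, note that the weights simply cancel: writing \({{\mathbf {v}}}=e_{-\omega }{{\mathbf {v}}}_0\) and \({{\mathbf {u}}}=e_{\omega }{{\mathbf {u}}}_0\) with \({{\mathbf {v}}}_0,{{\mathbf {u}}}_0\in L^{2-}_K\), we have \(e_{-\omega }(t)e_{\omega }(t)=1\), and hence, by the Cauchy–Schwarz inequality,

$$\begin{aligned} \left| \left\langle {{\mathbf {v}}} , {{\mathbf {u}}} \right\rangle _{L^{2-}_{K}}\right| = \left| \left\langle {{\mathbf {v}}}_0 , {{\mathbf {u}}}_0 \right\rangle _{L^{2-}_{K}}\right| \leqslant \Vert {{\mathbf {v}}}\Vert _{L^{2-}_{-\omega ,K}}\, \Vert {{\mathbf {u}}}\Vert _{L^{2-}_{\omega ,K}}, \end{aligned}$$

so that the \(L^{2-}_K\)-inner product extends continuously to the pairing of \(L^{2-}_{-\omega ,K}\) with \(L^{2-}_{\omega ,K}\) appearing in (2.6).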

Let now \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) be a well-posed system and fix a real number \(\omega \) with \(\omega >\omega _{\mathfrak {A}}\). By Theorem 2.5.4 in [31], \({\text {ran}}({\mathfrak {C}})\) is contained in \(L^{2+}_{\omega ,Y}\), while \({\mathfrak {B}}\) extends to a unique continuous linear operator from \(L^{2-}_{\omega ,U}\) into X, and the restriction of \({\mathfrak {D}}\) to \(L^2_{\ell ,loc,U}\bigcap L^{2}_{\omega ,U}\) has a unique linear extension that maps \(L^{2}_{\omega ,U}\) continuously into \(L^{2}_{\omega ,Y}\). We can thus reinterpret the operators \({\mathfrak {B}}\), \({\mathfrak {C}}\) and \({\mathfrak {D}}\) as

$$\begin{aligned} {\widetilde{{\mathfrak {B}}}}\in {{\mathcal {B}}}(L^{2-}_{\omega ,U},X), \quad {\widetilde{{\mathfrak {C}}}}\in {{\mathcal {B}}}(X,L^{2+}_{\omega ,Y}),\quad {\widetilde{{\mathfrak {D}}}}\in {{\mathcal {B}}}(L^2_{\omega ,U},L^2_{\omega ,Y}), \end{aligned}$$
(2.7)

and this reinterpretation can also be reversed, so that the original three operators can be recovered from their tilde versions. In case the operators \({\mathfrak {B}}\), \({\mathfrak {C}}\) and \({\mathfrak {D}}\) can be reinterpreted in the above fashion as bounded operators as in (2.7), then we say that \({\mathfrak {B}}\), \({\mathfrak {C}}\) and \({\mathfrak {D}}\) are \(\omega \)-bounded, respectively. Moreover, the \(C_0\)-semigroup \({\mathfrak {A}}^t\) is called \(\omega \)-bounded in case \(\sup _{t\geqslant 0}\Vert e^{-\omega t}{\mathfrak {A}}^t\Vert <\infty \).

The following proposition shows how the frequency-response-function approach at the beginning of the introduction can be used to define a transfer function for an infinite-dimensional well-posed system \(\Sigma \) directly via the integrated system operators \({\mathfrak {A}}\), \({\mathfrak {B}}\), \({\mathfrak {C}}\), \({\mathfrak {D}}\), thereby completely avoiding the system node \( {{\mathbf {S}}}= \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \) to be discussed in §4.

Proposition 2.3

For a well-posed system \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) and for all \(\omega >\omega _{\mathfrak {A}}\), \({\widetilde{{\mathfrak {D}}}}\) uniquely induces an operator \({\overline{{\mathfrak {D}}}}: H^1_{\omega ,loc}({{\mathbb {R}}};U) \rightarrow H^1_{\omega ,loc}({{\mathbb {R}}};Y)\), where

$$\begin{aligned} H^1_{\omega ,loc}({{\mathbb {R}}};K):=\left\{ f\in L^2_{loc,K}\mid \dot{f}\in L^2_{loc,K},~\pi _-f\in L^{2-}_{\omega ,K} \right\} , \end{aligned}$$
(2.8)

and the action of \({\overline{{\mathfrak {D}}}}\) is independent of \(\omega >\omega _{\mathfrak {A}}\). The transfer function \({\widehat{{\mathfrak {D}}}}\) of \(\Sigma \), given by

$$\begin{aligned} {\widehat{{\mathfrak {D}}}}(\lambda )u_0:= ({\overline{{\mathfrak {D}}}} e_\lambda u_0)(0), \quad \lambda \in {{\mathbb {C}}}_{\omega _{\mathfrak {A}}},\ u_0\in U, \end{aligned}$$

is well-defined and when restricted to the half-plane \({{\mathbb {C}}}_{\omega }\), for \(\omega >\omega _{\mathfrak {A}}\), gives a function in \(H^\infty ({{\mathbb {C}}}_\omega ;{{\mathcal {B}}}(U,Y))\). Furthermore we recover the Laplace-transform interpretation (1.4) of \({\widehat{{\mathfrak {D}}}} (\lambda )\) as follows: for \({{\mathbf {u}}}\in L^{2+}_{\omega ,U}\) we have

$$\begin{aligned} \widehat{ {\mathfrak {D}}{{\mathbf {u}}}}(\lambda ) = {\widehat{{\mathfrak {D}}}}(\lambda ) {\widehat{{{\mathbf {u}}}}}(\lambda ),\quad \lambda \in {{\mathbb {C}}}_\omega . \end{aligned}$$
(2.9)

Proposition 2.3 follows from Lemmas 4.5.1, 4.5.3 and 4.6.2 and Corollary 4.6.10 together with Definition 4.6.1 in [31]. We emphasize that the domain of the transfer function defined in Proposition 2.3 is \({{\mathbb {C}}}_{\omega _{\mathfrak {A}}}\), and at the same time remind the reader that we identify two analytic functions whenever they agree on a subset of the intersection of their domains that has a cluster point in the interior of that intersection. The key starting point for the preceding proposition is that

$$\begin{aligned} \frac{\tau ^h{\overline{{\mathfrak {D}}}} {{\mathbf {u}}}-{\overline{{\mathfrak {D}}}} {{\mathbf {u}}}}{h}= {\overline{{\mathfrak {D}}}} \frac{\tau ^h{{\mathbf {u}}}-{{\mathbf {u}}}}{h}, \end{aligned}$$

due to time invariance; see the proof of [31,  Lemma 4.5.1].
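This time invariance in particular makes every exponential \(e_\lambda u_0\) an eigenfunction of \({\overline{{\mathfrak {D}}}}\), which is exactly what the definition of \({\widehat{{\mathfrak {D}}}}\) in Proposition 2.3 exploits: since \(\tau ^t e_\lambda u_0=e^{\lambda t}e_\lambda u_0\) and \({\overline{{\mathfrak {D}}}}\) commutes with \(\tau ^t\),

$$\begin{aligned} ({\overline{{\mathfrak {D}}}} e_\lambda u_0)(t) = (\tau ^t{\overline{{\mathfrak {D}}}} e_\lambda u_0)(0) = ({\overline{{\mathfrak {D}}}} \tau ^t e_\lambda u_0)(0) = e^{\lambda t}\,({\overline{{\mathfrak {D}}}} e_\lambda u_0)(0) = e_\lambda (t)\,{\widehat{{\mathfrak {D}}}}(\lambda )u_0,\quad t\in {{\mathbb {R}}}, \end{aligned}$$

i.e., \({\overline{{\mathfrak {D}}}} e_\lambda u_0=e_\lambda \,{\widehat{{\mathfrak {D}}}}(\lambda )u_0\), so that \({\widehat{{\mathfrak {D}}}}(\lambda )\) maps the amplitude of the input wave to the amplitude of the output wave, as in the finite-dimensional case.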

Let us identify the spaces X, U and Y with their duals. Then the adjoints of the operators in (2.7) with respect to the appropriate duality pairings belong to the following spaces:

$$\begin{aligned} {\widetilde{{\mathfrak {B}}}}^*\in {{\mathcal {B}}}(X,L^{2-}_{-\omega ,U}), \quad {\widetilde{{\mathfrak {C}}}}^*\in {{\mathcal {B}}}(L^{2+}_{-\omega ,Y},X),\quad {\widetilde{{\mathfrak {D}}}}^*\in {{\mathcal {B}}}(L^2_{-\omega ,Y},L^2_{-\omega ,U}). \end{aligned}$$

Since \({\widetilde{{\mathfrak {B}}}}\), \({\widetilde{{\mathfrak {C}}}}\) and \({\widetilde{{\mathfrak {D}}}}\) are bounded linear Hilbert space operators, their adjoints are well defined. Noting that \(L^{2-}_{-\omega ,U}\subset L^{2-}_{loc,U}\) and \(L^{2+}_{r,Y}\subset L^{2+}_{-\omega ,Y}\), we can view the adjoints as operators of the following forms:

$$\begin{aligned} \begin{aligned} {\mathfrak {B}}^\circledast&:={\widetilde{{\mathfrak {B}}}}^*:X\rightarrow L^{2-}_{loc,U},\quad {\mathfrak {C}}^\circledast :={\widetilde{{\mathfrak {C}}}}^*\big |_{L^{2+}_{r,Y}}: L^{2+}_{r,Y}\rightarrow X,\\ {\mathfrak {D}}^\circledast&:={\widetilde{{\mathfrak {D}}}}^*|_{L^2_{r,loc,Y}}:L^2_{r,loc,Y}\rightarrow L^2_{r,loc,U}; \end{aligned} \end{aligned}$$
(2.10)

using [31,  Theorem 6.2.1], we indeed see that \({\widetilde{{\mathfrak {D}}}}^*\) has a restriction followed by an extension to an operator that maps \(L^2_{r,loc,Y}\) continuously into \(L^2_{r,loc,U}\). Using the reflection operator from (1.18), which we here denote by \(R\), so that \((R{{\mathbf {u}}})(t)={{\mathbf {u}}}(-t)\), we define the causal dual system \(\Sigma ^d\) of \(\Sigma \) via

$$\begin{aligned} \Sigma ^d= \begin{bmatrix} {\mathfrak {A}}^d &{} {\mathfrak {B}}^d\\ {\mathfrak {C}}^d &{} {\mathfrak {D}}^d \end{bmatrix} := \begin{bmatrix} {\mathfrak {A}}^* &{} {\mathfrak {C}}^\circledast R\\ R\,{\mathfrak {B}}^\circledast &{} R\,{\mathfrak {D}}^\circledast R \end{bmatrix}, \end{aligned}$$
(2.11)

where \({\mathfrak {A}}^*\) is the dual semigroup of \({\mathfrak {A}}\), i.e., \(({\mathfrak {A}}^*)^t=({\mathfrak {A}}^t)^*\), \(t\geqslant 0\). Here we depart from [10] by using the causal dual system instead of the anti-causal dual system, which would not have the reflections in (2.11). The reason is that we prefer to have all of the theory in [31] at our disposal.

Theorem 2.4

Let \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) be a well-posed system. Then the causal dual system \(\Sigma ^d\) of \(\Sigma \) is a well-posed system with input space Y, state space X and output space U. Moreover, the causal dual of \(\Sigma ^d\) is equal to \(\Sigma \) and the transfer function of \(\Sigma ^d\) is \({\widehat{{\mathfrak {D}}}}^d(\lambda )={\widehat{{\mathfrak {D}}}}({\overline{\lambda }})^*\), \({\overline{\lambda }}\in \rho (A)\), and in particular, \(\Vert {\widehat{{\mathfrak {D}}}}^d\Vert _{H^\infty ({{\mathbb {C}}}_\omega ;{{\mathcal {B}}}(Y,U))}=\Vert {\widehat{{\mathfrak {D}}}}\Vert _{H^\infty ({{\mathbb {C}}}_\omega ;{{\mathcal {B}}}(U,Y))}\) for all \(\omega >\omega _{\mathfrak {A}}\). If \(\Sigma \) is passive, then \(\Sigma ^d\) is passive too.

For the proof, see Theorems 6.2.3, 6.2.13 and Lemma 11.1.4 in [31].
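For a finite-dimensional system (1.1) with matrices (A, B, C, D), the causal dual system is, modulo the reflections in (2.11), the familiar adjoint system with matrices \((A^*,C^*,B^*,D^*)\), and the transfer function identity in Theorem 2.4 reduces to the matrix computation

$$\begin{aligned} {\widehat{{\mathfrak {D}}}}^d(\lambda ) = B^*(\lambda -A^*)^{-1}C^*+D^* = \left( C({\overline{\lambda }}-A)^{-1}B+D \right) ^* = {\widehat{{\mathfrak {D}}}}({\overline{\lambda }})^*. \end{aligned}$$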

Lemma 2.5

Let \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) be a well-posed system with causal dual system \(\Sigma ^d=\left[ {\begin{matrix}{\mathfrak {A}}^d&{}{\mathfrak {B}}^d\\ {\mathfrak {C}}^d&{}{\mathfrak {D}}^d \end{matrix}}\right] \). Define \({\mathfrak {B}}^t\), \({\mathfrak {C}}^t\) and \({\mathfrak {D}}^t\) as in (2.1) and define \(({\mathfrak {B}}^d)^t\), \(({\mathfrak {C}}^d)^t\) and \(({\mathfrak {D}}^d)^t\) analogously for the dual system \(\Sigma ^d\). Then

$$\begin{aligned} \begin{bmatrix} ({\mathfrak {A}}^d)^t &{} ({\mathfrak {B}}^d)^t\\ ({\mathfrak {C}}^d)^t &{} ({\mathfrak {D}}^d)^t \end{bmatrix}^* = \begin{bmatrix} 1_X &{} 0 \\ 0 &{} \Lambda ^t_Y \end{bmatrix}^* \begin{bmatrix} {\mathfrak {A}}^t &{} {\mathfrak {B}}^t\\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t \end{bmatrix} \begin{bmatrix} 1_X &{} 0 \\ 0 &{} \Lambda ^t_U \end{bmatrix},\quad t>0, \end{aligned}$$

where for a separable Hilbert space K we define \(\Lambda ^t_K\in {{\mathcal {B}}}(L^2([0,t],K))\) to be the unitary flip given by \((\Lambda ^t_K{{\mathbf {u}}})(s)={{\mathbf {u}}}(t-s)\) for \(0\leqslant s\leqslant t\).

Proof

We need to prove that for each \(t>0\):

$$\begin{aligned} (({\mathfrak {A}}^d)^t)^*={\mathfrak {A}}^t,~ (({\mathfrak {C}}^d)^t)^*={\mathfrak {B}}^t \Lambda ^t_U,~ (({\mathfrak {B}}^d)^t)^*=(\Lambda ^t_Y)^* {\mathfrak {C}}^t,~ (({\mathfrak {D}}^d)^t)^*= (\Lambda ^t_Y)^* {\mathfrak {D}}^t \Lambda ^t_U. \end{aligned}$$

The first identity follows directly from the definition of \(({\mathfrak {A}}^d)^t\). Next note that

$$\begin{aligned} ({\mathfrak {C}}^d)^t=\pi _{[0,t]}{\mathfrak {C}}^d =\pi _{[0,t]}R\,{\mathfrak {B}}^\circledast =R\,\pi _{[-t,0]}{\widetilde{{\mathfrak {B}}}}^*. \end{aligned}$$

Thus, for all \(t>0\), \({{\mathbf {u}}}\in L^{2}([0,t],U)\) and \(x\in X\) we have, since \(R\) is a self-adjoint unitary involution,

$$\begin{aligned} \left\langle ({\mathfrak {C}}^d)^t x , {{\mathbf {u}}} \right\rangle _{L^{2}([0,t],U)} = \left\langle {\widetilde{{\mathfrak {B}}}}^* x , R{{\mathbf {u}}} \right\rangle = \left\langle x , {\widetilde{{\mathfrak {B}}}}R{{\mathbf {u}}} \right\rangle _X = \left\langle x , {\mathfrak {B}}R{{\mathbf {u}}} \right\rangle _X, \end{aligned}$$

using that \({\mathfrak {B}}\) and \({\widetilde{{\mathfrak {B}}}}\) coincide on \(L^{2-}_{\ell ,U}\) in the last step. It thus follows for \(t>0\) and \({{\mathbf {u}}}\in L^{2}([0,t],U)\) that

$$\begin{aligned} (({\mathfrak {C}}^d)^t)^* {{\mathbf {u}}} = {\mathfrak {B}}R{{\mathbf {u}}} = {\mathfrak {B}}\pi _-\tau ^t \Lambda ^t_U {{\mathbf {u}}} = {\mathfrak {B}}^t \Lambda ^t_U {{\mathbf {u}}}, \end{aligned}$$

and this proves the second identity. The third identity follows by an almost identical argument.

It remains to prove the last identity. For this purpose, let \({{\mathbf {y}}}\in L^{2}([0,t],Y)\) and \({{\mathbf {u}}}\in L^{2}([0,t],U)\). Then

$$\begin{aligned} \left\langle ({\mathfrak {D}}^d)^t {{\mathbf {y}}} , {{\mathbf {u}}} \right\rangle _{L^{2}([0,t],U)} = \left\langle R\,{\mathfrak {D}}^\circledast R{{\mathbf {y}}} , {{\mathbf {u}}} \right\rangle = \left\langle {\widetilde{{\mathfrak {D}}}}^* R{{\mathbf {y}}} , R{{\mathbf {u}}} \right\rangle = \left\langle R{{\mathbf {y}}} , {\mathfrak {D}}R{{\mathbf {u}}} \right\rangle = \left\langle {{\mathbf {y}}} , R\,\pi _{[-t,0]}{\mathfrak {D}}R{{\mathbf {u}}} \right\rangle _{L^{2}([0,t],Y)}. \end{aligned}$$

It follows, using the time invariance of \({\mathfrak {D}}\), that

$$\begin{aligned} (({\mathfrak {D}}^d)^t)^* {{\mathbf {u}}} = R\,\pi _{[-t,0]}{\mathfrak {D}}R{{\mathbf {u}}} = (\Lambda ^t_Y)^*\pi _{[0,t]}{\mathfrak {D}}\Lambda ^t_U {{\mathbf {u}}} = (\Lambda ^t_Y)^* {\mathfrak {D}}^t \Lambda ^t_U {{\mathbf {u}}}, \end{aligned}$$

which proves the last identity. \(\square \)

The following notions will be important in the sequel:

Definition 2.6

A well-posed system \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) is (approximately) controllable if the finite-time reachable subspace

$$\begin{aligned} {\text {Rea}}(\Sigma ):={\text {ran}}({\mathfrak {B}})=\text{ span }\left\{ {\text {ran}}((({\mathfrak {C}}^d)^t)^*) \mid t>0 \right\} \end{aligned}$$

is dense in X. Following [3], we say that the system \(\Sigma \) is (approximately) observable if the finite-time observable subspace

$$\begin{aligned} {\text {Obs}}(\Sigma ):=\text{ span }\left\{ {\text {ran}}(({\mathfrak {C}}^t)^*) \mid t>0 \right\} = {\text {ran}}({\mathfrak {B}}^d) \end{aligned}$$

is dense in X, and it is (approximately) minimal if it is both controllable and observable.

Note that the equalities in the definitions of \(\text{ Rea }\,(\Sigma )\) and \(\text{ Obs }\,(\Sigma )\) are dual to each other and follow directly from Lemma 2.5 and the formulas (2.2); they imply the following corollary:

Corollary 2.7

The well-posed system \(\Sigma \) is controllable (resp. observable) if and only if \(\Sigma ^d\) is observable (resp. controllable). In particular, \(\Sigma \) is minimal if and only if \(\Sigma ^d\) is minimal.

The following lemma shows that our definitions agree with the other common definitions of controllability and observability:

Lemma 2.8

The well-posed system \(\Sigma \) is controllable if and only if \({\mathfrak {C}}^d\) is one-to-one, and observable if and only if \({\mathfrak {C}}\) is one-to-one.

Proof

We prove the statement regarding observability; for controllability the claim follows by duality. For \(x\in X\) we have

$$\begin{aligned} \begin{aligned} {\mathfrak {C}}x=0 \quad&\Longleftrightarrow \quad {\mathfrak {C}}^t x=\pi _{[0,t]} {\mathfrak {C}}x =0\quad \text{ for } \text{ all } t>0\\&\Longleftrightarrow \quad \left\langle {\mathfrak {C}}^t x , y \right\rangle =0 \quad \text{ for } \text{ all } t>0 \text{ and } y\in L^{2}([0,t],Y)\\&\Longleftrightarrow \quad x\perp {\text {ran}}(({\mathfrak {C}}^t)^*) \quad \text{ for } \text{ all } t>0\\&\Longleftrightarrow \quad x\perp \text{ Obs }\,(\Sigma ), \end{aligned} \end{aligned}$$

which proves our claim. \(\square \)

3 The \(L^2\)-input and \(L^2\)-output Maps of a Well-posed Linear System

The concepts of \(\ell ^2\)-exact controllability, \(\ell ^2\)-exact observability, and \(\ell ^2\)-exact minimality were recently introduced for discrete-time systems in [9]. We will now extend these concepts to well-posed continuous-time systems.

Define the (in general unbounded) \(L^2\)-output map as

$$\begin{aligned} \begin{aligned} {\mathbf {W}}_o := {\mathfrak {C}}\big |_{{\text {dom}}({\mathbf {W}}_o)}:X\supset {\text {dom}}({\mathbf {W}}_o )\rightarrow L^{2+}_Y,\\ \text {with }{\text {dom}}({\mathbf {W}}_o ) := \left\{ x\in X\mid {\mathfrak {C}}x\in L^{2+}_Y \right\} ; \end{aligned} \end{aligned}$$
(3.1)

i.e., we restrict \({\mathfrak {C}}\) to the \(x\in X\) that are mapped into \(L^{2+}_Y\), rather than into \(L^{2+}_{loc,Y}\), and view the resulting operator as mapping with codomain \(L^{2+}_Y\). Note that \({\text {ker}}({\mathbf {W}}_o)={\text {ker}}({\widetilde{{\mathfrak {C}}}})={\text {ker}}({\mathfrak {C}})\) and hence \(\Sigma \) is observable if and only if \({\mathbf {W}}_o \) is one-to-one, or equivalently, if and only if \({\widetilde{{\mathfrak {C}}}}\) is one-to-one.
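That \({\text {dom}}({\mathbf {W}}_o)\) may be a proper, and even non-dense, subspace of X is already visible in finite dimensions: take \(X={{\mathbb {C}}}^2\), \(A=\left[ {\begin{matrix} -1 &{} 0\\ 0 &{} 1 \end{matrix}}\right] \) and \(C=\begin{bmatrix} 1&1 \end{bmatrix}\) (the choice of B plays no role here), so that \(({\mathfrak {C}}x)(t)=e^{-t}x_1+e^{t}x_2\). Then \({\mathfrak {C}}x\in L^{2+}_Y\) exactly when \(x_2=0\), i.e., \({\text {dom}}({\mathbf {W}}_o)={{\mathbb {C}}}\times \{0\}\), which is not dense in X. This explains why denseness of \({\text {dom}}({\mathbf {W}}_o)\) appears as an explicit assumption in the next proposition.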

Proposition 3.1

Let \({\mathbf {W}}_o\) be the \(L^2\)-output map of a well-posed system \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \). Then \({\mathbf {W}}_o\) is closed. If, in addition, \({\mathbf {W}}_o\) is densely defined, then the following hold:

  1. (1)

    The operator \({\mathbf {W}}_o\) has a closed and densely defined adjoint \({\mathbf {W}}_o^*\).

  2. (2)

    A function \({{\mathbf {y}}}\in L^{2+}_Y\) lies in \({\text {dom}}({\mathbf {W}}_o^*)\) if and only if there exists an \(x_o\in X\) such that

    $$\begin{aligned} \lim _{t\rightarrow \infty } \left\langle x , ({\mathfrak {C}}^t)^* \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _X = \left\langle x , x_o \right\rangle _X,\quad x\in {\text {dom}}({\mathbf {W}}_o ). \end{aligned}$$
    (3.2)

    When \({{\mathbf {y}}}\in {\text {dom}}({\mathbf {W}}_o^*)\), we have \({\mathbf {W}}_o^* {{\mathbf {y}}}=x_o\), where \(x_o\) is given by (3.2).

  3. (3)

    It holds that \(L^{2+}_{r,Y}\subset {\text {dom}}({\mathbf {W}}_o^*)\), that \({\mathbf {W}}_o^*\big |_{L^{2+}_{r,Y}}={\mathfrak {C}}^\circledast \), and that \({\mathbf {W}}_o^* L^{2+}_{r,Y}= {\text {ran}}({\mathfrak {B}}^d)=\text{ Obs }\,(\Sigma )\).

  4. (4)

    For all \(s>0\) and \({{\mathbf {y}}}\in {\text {dom}}({\mathbf {W}}_o^*)\) we have

    $$\begin{aligned} \tau ^{-s} {\text {dom}}({\mathbf {W}}_o^*) \subset {\text {dom}}({\mathbf {W}}_o^*),\quad {\mathbf {W}}_o^* \tau ^{-s}{{\mathbf {y}}}=({\mathfrak {A}}^{s})^* {\mathbf {W}}_o^* {{\mathbf {y}}}. \end{aligned}$$

Before giving the proof, we remark that by Lemma 2.5, the limit in (3.2) can be rewritten as

$$\begin{aligned} \lim _{t\rightarrow \infty } \left\langle x , ({\mathfrak {B}}^d)^t (\Lambda ^t_Y)^* \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _X = \left\langle x , x_o \right\rangle _X,\quad x\in {\text {dom}}({\mathbf {W}}_o ), \end{aligned}$$
(3.3)

because the expressions inside the limits are the same: \(({\mathfrak {C}}^t)^*=({\mathfrak {B}}^d)^t(\Lambda ^t_Y)^*\).

Proof

To see that \({\mathbf {W}}_o\) is closed, let \({\text {dom}}({\mathbf {W}}_o) \ni x_k\rightarrow x\) in X and \({\mathbf {W}}_o x_k \rightarrow {{\mathbf {y}}}\) in \(L^{2+}_Y\). Fix \(b>0\) arbitrarily and observe that \(\pi _{[0,b]}{\mathfrak {C}}\) is a bounded operator from X to \(L^{2+}_{Y}\), by the well-posedness of \(\Sigma \). Hence

$$\begin{aligned} \pi _{[0,b]} {\mathfrak {C}}x = \lim _{k\rightarrow \infty } \pi _{[0,b]} {\mathfrak {C}}x_k = \lim _{k\rightarrow \infty } \pi _{[0,b]} {\mathbf {W}}_o x_k = \pi _{[0,b]} {{\mathbf {y}}}. \end{aligned}$$

Now let \(b\rightarrow \infty \) to get that \({\mathfrak {C}}x={{\mathbf {y}}}\in L^{2+}_{Y}\). This shows that \(x\in {\text {dom}}({\mathbf {W}}_o)\) and \({\mathbf {W}}_o x ={{\mathbf {y}}}\). Hence \({\mathbf {W}}_o\) is closed, as claimed.

In the remainder of the proof we assume that \({\text {dom}}({\mathbf {W}}_o)\) is dense in X and we prove items (1)–(4). Note that item (1) follows directly from [28,  Theorems 13.9 and 13.12], since \({\mathbf {W}}_o\) is closed and densely defined.

We now proceed with the explicit characterization of \({\mathbf {W}}_o^*\) given in item (2). Let \(x\in {\text {dom}}({\mathbf {W}}_o)\) and \({{\mathbf {y}}}\in L^{2+}_Y\). We have \({\mathfrak {C}}x={\mathbf {W}}_o x\in L^{2+}_Y\). Hence

$$\begin{aligned} \left\langle {\mathbf {W}}_o x , {{\mathbf {y}}} \right\rangle _{L^{2+}_Y}&= \left\langle {\mathfrak {C}}x , {{\mathbf {y}}} \right\rangle _{L^{2+}_Y} \!\!=\! \lim _{t\rightarrow \infty } \left\langle \pi _{[0,t]}{\mathfrak {C}}x , {{\mathbf {y}}} \right\rangle _{L^{2+}_{Y}} \!\!=\! \lim _{t\rightarrow \infty } \left\langle {\mathfrak {C}}^t x , \pi _{[0,t]}{{\mathbf {y}}} \right\rangle _{L^{2}([0,t],Y)}\\&=\lim _{t\rightarrow \infty } \left\langle x , ({\mathfrak {C}}^t)^* \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _X. \end{aligned}$$

Then \({{\mathbf {y}}}\in {\text {dom}}({\mathbf {W}}_o^*)\) if and only if there exists an \(x_0\in X\), such that for all \(x\in {\text {dom}}({\mathbf {W}}_o)\), we have

$$\begin{aligned} \left\langle x , x_0 \right\rangle _X=\left\langle {\mathbf {W}}_o x , {{\mathbf {y}}} \right\rangle _{L^{2+}_Y}=\lim _{t\rightarrow \infty } \left\langle x , ({\mathfrak {C}}^t)^* \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _X. \end{aligned}$$

This proves item (2), and we next prove item (3).

In case \({{\mathbf {y}}}\in L^{2+}_{r,Y}\), say \({\text {supp}}({{\mathbf {y}}})\subset [0,T]\), then \(({\mathfrak {C}}^t)^*\pi _{[0,t]}{{\mathbf {y}}}\) is independent of t for \(t>T\), and thus \(x_o:=\lim _{t\rightarrow \infty } ({\mathfrak {C}}^t)^*\pi _{[0,t]}{{\mathbf {y}}}=({\mathfrak {C}}^T)^*{{\mathbf {y}}}\) exists and satisfies (3.2) by (3.3). Hence \(L^{2+}_{r,Y}\subset {\text {dom}}({\mathbf {W}}_o^*)\) and for \({{\mathbf {y}}}\in L^{2+}_{r,Y}\), by (3.3) it holds that

$$\begin{aligned} {\mathbf {W}}_o^* {{\mathbf {y}}}= ({\mathfrak {C}}^T)^* {{\mathbf {y}}}= ({\mathfrak {B}}^d)^T (\Lambda ^T_Y)^* {{\mathbf {y}}}= {\mathfrak {C}}^\circledast {{\mathbf {y}}}, \end{aligned}$$

and then clearly

$$\begin{aligned} {\mathbf {W}}_o^* L^{2+}_{r,Y}= {\mathfrak {C}}^\circledast L^{2+}_{r,Y}= {\text {ran}}({\mathfrak {B}}^d)=\text{ Obs }\,(\Sigma ); \end{aligned}$$

this proves all of item (3).

Finally, we prove item (4). Fix \(s,t>0\), \(x\in X\) and \({{\mathbf {y}}}\in {\text {dom}}({\mathbf {W}}_o^*)\subset L^{2+}_Y\). Then we have

$$\begin{aligned}&\left\langle x , ({\mathfrak {C}}^{t+s})^* \pi _{[0,t+s]} \tau ^{-s} {{\mathbf {y}}} \right\rangle _X\\&\quad =\left\langle x , ({\mathfrak {C}}^{t+s})^* \tau ^{-s} \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _X\\&\quad = \left\langle \tau _+^s{\mathfrak {C}}^{t+s}x , \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _{L^{2}([0,t],Y)} = \left\langle \tau ^s_+\pi _{[0,s+t]}{\mathfrak {C}}x , \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _{L^{2}([0,t],Y)}\\&\quad = \left\langle \pi _{[0,t]}\tau _+^s{\mathfrak {C}}x , \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _{L^{2}([0,t],Y)} = \left\langle \pi _{[0,t]}{\mathfrak {C}}{\mathfrak {A}}^s x , \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _{L^{2}([0,t],Y)}\\&\quad = \left\langle {\mathfrak {C}}^t {\mathfrak {A}}^s x , \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _{L^{2}([0,t],Y)} = \left\langle {\mathfrak {A}}^s x , ({\mathfrak {C}}^t)^* \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _X. \end{aligned}$$

Moreover, for \(x\in {\text {dom}}({\mathbf {W}}_o)\), we have \({\mathfrak {A}}^sx\in {\text {dom}}({\mathbf {W}}_o)\), since \({\mathfrak {C}}{\mathfrak {A}}^sx=\tau _+^s{\mathfrak {C}}x\in L^{2+}_Y\). Using all of this, we find for \(x\in {\text {dom}}({\mathbf {W}}_o)\) and \(x_o\in X\) satisfying (3.2) that

$$\begin{aligned} \left\langle x , ({\mathfrak {A}}^s)^* x_o \right\rangle _X&= \left\langle {\mathfrak {A}}^s x , x_o \right\rangle _X = \lim _{t\rightarrow \infty } \left\langle {\mathfrak {A}}^s x , ({\mathfrak {C}}^t)^* \pi _{[0,t]} {{\mathbf {y}}} \right\rangle _X\\&= \lim _{t\rightarrow \infty } \left\langle x , ({\mathfrak {C}}^{t+s})^* \pi _{[0,t+s]} \tau ^{-s} {{\mathbf {y}}} \right\rangle _X = \lim _{r\rightarrow \infty } \left\langle x , ({\mathfrak {C}}^{r})^* \pi _{[0,r]} \tau ^{-s} {{\mathbf {y}}} \right\rangle _X. \end{aligned}$$

Since the limit exists for every \(x\in {\text {dom}}({\mathbf {W}}_o)\), it follows that \(\tau ^{-s} {{\mathbf {y}}}\in {\text {dom}}({{\mathbf {W}}}_o^*)\) and \({{\mathbf {W}}}_o^*\tau ^{-s} {{\mathbf {y}}}=({\mathfrak {A}}^s)^* x_o= ({\mathfrak {A}}^s)^* {{\mathbf {W}}}_o^* {{\mathbf {y}}}\), which proves item (4). \(\square \)

The \(L^2\)-input map is defined similarly, via the causal dual system. We first define the adjoint \(L^2\)-input map \({\mathbf {W}}_c^\bigstar \), using \(\bigstar \) to indicate that \({\mathbf {W}}_c^\bigstar \) is defined directly and not as the adjoint of an operator \({\mathbf {W}}_c\):

$$\begin{aligned} \begin{aligned} {\mathbf {W}}_c^\bigstar&:={\mathfrak {B}}^\circledast \big |_{{\text {dom}}({\mathbf {W}}_c^\bigstar )}:X\supset {\text {dom}}({\mathbf {W}}_c^\bigstar )\rightarrow L^{2-}_U,\\ \text {with }{\text {dom}}({\mathbf {W}}_c^\bigstar )&:= \left\{ x\in X\mid {\mathfrak {B}}^\circledast x\in L^{2-}_U \right\} . \end{aligned} \end{aligned}$$
(3.4)

Defining \({\mathbf {W}}_o^d\) and \({\mathbf {W}}_c^{d\bigstar }\) similarly as in (3.1) and (3.4), respectively, for the causal dual system \(\Sigma ^d\), one obtains

$$\begin{aligned} {\mathbf {W}}_o^d=R\,{\mathbf {W}}_c^\bigstar \quad \text {and}\quad {\mathbf {W}}_c^{d\bigstar }=R\,{\mathbf {W}}_o, \end{aligned}$$
(3.5)

and in particular, \(\Sigma \) is minimal if and only if \({\mathbf {W}}_o\) and \({\mathbf {W}}_c^\bigstar \) are both injective.

By duality, from Proposition 3.1, we obtain the following result:

Proposition 3.2

Let \({\mathbf {W}}_c^\bigstar \) be the adjoint \(L^2\)-input map of a well-posed system \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \). Then \({\mathbf {W}}_c^\bigstar \) is closed. If, in addition, \({\mathbf {W}}_c^\bigstar \) is densely defined, then the following hold:

  1. (1)

    The operator \({\mathbf {W}}_c^\bigstar \) has a closed and densely defined adjoint, denoted by \({\mathbf {W}}_c\), such that \({\mathbf {W}}_c^\bigstar ={\mathbf {W}}_c^*\).

  2. (2)

    A function \({{\mathbf {u}}}\in L^{2-}_U\) lies in \({\text {dom}}({\mathbf {W}}_c)\) if and only if there exists an \(x_c\in X\) such that

    $$\begin{aligned} \lim _{t\rightarrow \infty }\left\langle x , {\mathfrak {B}}\pi _{[-t,0]} {{\mathbf {u}}} \right\rangle _X =\left\langle x , x_c \right\rangle _X,\quad x\in {\text {dom}}({\mathbf {W}}_c^\bigstar ). \end{aligned}$$
    (3.6)

    When \({{\mathbf {u}}}\in {\text {dom}}({\mathbf {W}}_c)\), we have \({\mathbf {W}}_c {{\mathbf {u}}}=x_c\), where \(x_c\) is given by (3.6).

  3. (3)

    It holds that \(L^{2-}_{\ell ,U}\subset {\text {dom}}({\mathbf {W}}_c)\), that \({\mathbf {W}}_c\big |_{L^{2-}_{\ell ,U}}={\mathfrak {B}}\), and that \({\mathbf {W}}_c L^{2-}_{\ell ,U}= {\text {ran}}({\mathfrak {B}})=\text{ Rea }\,(\Sigma )\).

  4. (4)

    For all \(s>0\) we have \(\tau ^{s} {\text {dom}}({\mathbf {W}}_c) \subset {\text {dom}}({\mathbf {W}}_c)\) and \({\mathbf {W}}_c \tau ^{s}{{\mathbf {u}}}={\mathfrak {A}}^{s} {\mathbf {W}}_c {{\mathbf {u}}}\) for all \({{\mathbf {u}}}\in {\text {dom}}({\mathbf {W}}_c)\).

Again, it holds by Lemma 2.5 that

$$\begin{aligned} {\mathfrak {B}}\pi _{[-t,0]} {{\mathbf {u}}}= {\mathfrak {B}}^t \pi _{[0,t]}\tau ^{-t} {{\mathbf {u}}}= (({\mathfrak {C}}^d)^t)^* (\Lambda ^t_U)^* \pi _{[0,t]}\tau ^{-t} {{\mathbf {u}}},\quad t>0, \end{aligned}$$

so that the expression inside the limit in (3.6) can be rewritten in terms of the causal dual system.

We have the following easy corollary:

Corollary 3.3

Assume that the adjoint \(L^2\)-input map \({\mathbf {W}}_c^\bigstar \) of a well-posed system \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) is densely defined. For every system trajectory \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma \) on \({{\mathbb {R}}}\), we have \(\pi _-{{\mathbf {u}}}\in {\text {dom}}({\mathbf {W}}_c)\) and \({{\mathbf {x}}}(0)={\mathbf {W}}_c\pi _-{{\mathbf {u}}}\).

Proof

Let \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) be a trajectory of \(\Sigma \) on \({{\mathbb {R}}}\). By Definition 2.2, we then have \(\pi _-{{\mathbf {u}}}\in L^{2-}_{\ell ,U}\subset {\text {dom}}({\mathfrak {B}})\subset {\text {dom}}({\mathbf {W}}_c)\). By item (3) of Proposition 3.2 and (2.4), \({{\mathbf {x}}}(0)={\mathfrak {B}}\pi _-{{\mathbf {u}}}={\mathbf {W}}_c\pi _-{{\mathbf {u}}}\). \(\square \)

In the remainder of this section we shall assume that \({\widehat{{\mathfrak {D}}}}\big |_{{\text {dom}}({\widehat{{\mathfrak {D}}}})\bigcap {{{\mathbb {C}}}^{+}}}\) has a unique analytic extension to a function in \(H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\), also denoted by \({\widehat{{\mathfrak {D}}}}\). With our convention to identify analytic functions that coincide on some set with an interior cluster point, we simply write \({\widehat{{\mathfrak {D}}}}\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\). In that case, \({\widehat{{\mathfrak {D}}}}\) defines a bounded pointwise multiplication operator

$$\begin{aligned} M_{{\widehat{{\mathfrak {D}}}}}:L^2(i{{\mathbb {R}}};U) \rightarrow L^2(i{{\mathbb {R}}};Y),\quad (M_{{\widehat{{\mathfrak {D}}}}} f)(\lambda )= {\widehat{{\mathfrak {D}}}} (\lambda ) f(\lambda ), \quad \lambda \in i{{\mathbb {R}}}, \end{aligned}$$
(3.7)

with operator norm \(\Vert M_{{\widehat{{\mathfrak {D}}}}} \Vert \) equal to the supremum norm \(\Vert {\widehat{{\mathfrak {D}}}}\Vert _\infty \) of \({\widehat{{\mathfrak {D}}}}\) over \({{{\mathbb {C}}}^{+}}\). Further, let \({{\mathcal {L}}}:L^2({{\mathbb {R}}};K)\rightarrow L^2(i{{\mathbb {R}}};K)\) denote the unitary bilateral Laplace transform

$$\begin{aligned} ({{\mathcal {L}}}{{\mathbf {u}}})(\lambda )=\int _{-\infty }^\infty e^{-\lambda t}\,{{\mathbf {u}}}(t)\,{\mathrm {d}}t,\quad \lambda \in i{{\mathbb {R}}}, \end{aligned}$$

which in particular maps \(L^{2+}_K\) unitarily onto \(H^{2+}_K:=H^2({{{\mathbb {C}}}^{+}};K)\). We then define the \(L^2\)-transfer map \(L_\Sigma \) by

$$\begin{aligned} L_\Sigma :={{\mathcal {L}}}^*M_{{\widehat{{\mathfrak {D}}}}}{{\mathcal {L}}}\in {{\mathcal {B}}}(L^2_U,L^2_Y). \end{aligned}$$
(3.8)

We now derive various properties of this operator.
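As an illustration (a standard scalar example, not taken from [31]): for the exponentially stable system with \(A=-1\), \(B=C=1\), \(D=0\) and \(U=Y={{\mathbb {C}}}\), we have \({\widehat{{\mathfrak {D}}}}(\lambda )=(\lambda +1)^{-1}\in H^\infty ({{{\mathbb {C}}}^{+}})\), and \(L_\Sigma \) is the causal convolution

$$\begin{aligned} (L_\Sigma {{\mathbf {u}}})(t)=\int _{-\infty }^t e^{-(t-s)}{{\mathbf {u}}}(s) \,{\mathrm {d}}s,\quad t\in {{\mathbb {R}}}. \end{aligned}$$

For \({{\mathbf {u}}}\) supported on \({{{\mathbb {R}}}^{-}}\) and \(t\geqslant 0\) this collapses to \((\pi _+L_\Sigma {{\mathbf {u}}})(t)=e^{-t}\int _{-\infty }^0 e^{s}{{\mathbf {u}}}(s)\,{\mathrm {d}}s=({\mathfrak {C}}{\mathfrak {B}}{{\mathbf {u}}})(t)\), a rank-one Hankel operator; this is exactly the factorization \({\mathfrak {H}}_\Sigma ={\mathbf {W}}_o{\mathfrak {B}}\) that Theorem 3.4 below establishes in general.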

Theorem 3.4

Let \(\Sigma \) be a well-posed linear system with transfer function \({\widehat{{\mathfrak {D}}}}\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\). The following statements are true:

  1. (1)

    The operator \(L_\Sigma \) in (3.8) is the unique continuous linear extension to an operator in \({{\mathcal {B}}}(L^2_U,L^2_Y)\) of the restriction of \({\mathfrak {D}}\) to \(L^2_{\ell ,U}\). Moreover, we have \(\Vert L_\Sigma \Vert =\Vert {\widehat{{\mathfrak {D}}}}\Vert _\infty \) and \(L_\Sigma \) is causal, i.e., \(\pi _- L_\Sigma \pi _+=0\), and time-invariant, i.e., \(\tau ^t L_\Sigma =L_\Sigma \tau ^t\) for all \(t\in {{\mathbb {R}}}\).

  2. (2)

    It holds that \({\text {ran}}({\mathfrak {B}})\subset {\text {dom}}({\mathbf {W}}_o)\). The restriction to \(L^{2-}_{\ell ,U}\) of the Hankel operator \(\pi _+{\mathfrak {D}}\pi _-\) has a unique extension to an operator in \({{\mathcal {B}}}(L^{2-}_U,L^{2+}_Y)\), which equals

    $$\begin{aligned} {\mathfrak {H}}_\Sigma :=\pi _+L_\Sigma \big |_{L^{2-}_U}\ \ \text{ and } \text{ satisfies }\ \ \Vert {\mathfrak {H}}_\Sigma \Vert \leqslant \Vert {\widehat{{\mathfrak {D}}}}\Vert _\infty ,\quad {\mathfrak {H}}_\Sigma |_{L^{2-}_{\ell ,U}}={\mathbf {W}}_o {\mathfrak {B}}. \end{aligned}$$
    (3.9)
  3. (3)

    For the causal dual system \(\Sigma ^d\) we have \({\widehat{{\mathfrak {D}}}}^d\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(Y,U))\), the unique extension \(L_{\Sigma ^d}\) in \({{\mathcal {B}}}(L^{2}_Y,L^{2}_U)\) of \({\mathfrak {D}}^d\) restricted to \(L^2_{\ell ,Y}\) satisfies

    $$\begin{aligned} L_{\Sigma ^d}=R\,L_\Sigma ^*\,R, \end{aligned}$$
    (3.10)

    and the \(L^2\)-analogue of the Hankel operator of the causal dual is \({\mathfrak {H}}_{\Sigma ^d}:=\pi _+L_{\Sigma ^d}\big |_{L^{2-}_Y}\). Moreover, we have \({\text {ran}}({\mathfrak {B}}^d)\subset {\text {dom}}({\mathbf {W}}_c^\bigstar )\) and

    $$\begin{aligned} {\mathfrak {H}}_\Sigma ^*\big |_{L^{2+}_{r,Y}}={\mathbf {W}}_c^\bigstar \,{\mathfrak {C}}^\circledast . \end{aligned}$$
    (3.11)
  4. (4)

    Furthermore, if \({\text {dom}}({\mathbf {W}}_c ^\bigstar )\) is dense in X, then \({\text {ran}}({\mathbf {W}}_c ) \subset {\text {dom}}({\mathbf {W}}_o )\) and

    $$\begin{aligned} {\mathfrak {H}}_\Sigma \big |_{{\text {dom}}({\mathbf {W}}_c )}={\mathbf {W}}_o {\mathbf {W}}_c. \end{aligned}$$
    (3.12)

    If \({\text {dom}}({\mathbf {W}}_o)\) is dense in X, then \({\text {ran}}({\mathbf {W}}_o^* ) \subset {\text {dom}}({\mathbf {W}}_c^\bigstar )\) and

    $$\begin{aligned} {\mathfrak {H}}_\Sigma ^*\big |_{{\text {dom}}({\mathbf {W}}_o^* )}={\mathbf {W}}_c^\bigstar {\mathbf {W}}_o^*. \end{aligned}$$
    (3.13)

Proof

Since \({\widehat{{\mathfrak {D}}}}\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\), the operator \(L_\Sigma \) maps \(L^{2+}_U\) into \(L^{2+}_Y\); hence \(L_\Sigma \) is causal. Moreover, for every \(\omega \in {{\mathbb {R}}}\), since \(M_{{\widehat{{\mathfrak {D}}}}}\) intertwines \(M_{e_\omega 1_U}\) and \(M_{e_\omega 1_Y}\), where \((e_\omega 1_K)(z)=e^{\omega z} 1_K \), we get that \(L_\Sigma \) commutes with \(\tau ^t\) (suppressing the spaces U and Y in the notation); hence \(L_\Sigma \) is time invariant. Now let \({{\mathbf {u}}}\in L^2_U\) have \({\text {supp}}({{\mathbf {u}}})\subset [N,\infty )\) for some \(N\in {{\mathbb {R}}}\). Then \({{\mathbf {u}}}\in {\text {dom}}(L_\Sigma )\bigcap {\text {dom}}({\mathfrak {D}})\) and \(\tau ^N{{\mathbf {u}}}\in L^{2+}_U\subset L^{2+}_{\omega ,U}\) for \(\omega >\max \,\{0,\omega _{\mathfrak {A}}\}\). By [31,  Corollary 4.6.10(iii)] we have \(M_{{\widehat{{\mathfrak {D}}}}} {{\mathcal {L}}}(\tau ^N {{\mathbf {u}}})={{\mathcal {L}}}({\mathfrak {D}}\pi _+ \tau ^N {{\mathbf {u}}})={{\mathcal {L}}}({\mathfrak {D}}\tau ^N {{\mathbf {u}}})\). Hence

$$\begin{aligned} \tau ^N L_\Sigma {{\mathbf {u}}}= L_\Sigma \tau ^N {{\mathbf {u}}}= {{\mathcal {L}}}^*M_{{\widehat{{\mathfrak {D}}}}}{{\mathcal {L}}}(\tau ^N{{\mathbf {u}}})= {{\mathcal {L}}}^* {{\mathcal {L}}}({\mathfrak {D}}\tau ^N {{\mathbf {u}}})= {\mathfrak {D}}\tau ^N {{\mathbf {u}}}= \tau ^N {\mathfrak {D}}{{\mathbf {u}}}. \end{aligned}$$

It follows that \(L_\Sigma {{\mathbf {u}}}= {\mathfrak {D}}{{\mathbf {u}}}\) for every \({{\mathbf {u}}}\in L^2_{\ell ,U}\). Since the latter subspace is dense in \(L^2_U\), the only extension to a bounded linear operator on \(L^2_U\) of the restriction of \({\mathfrak {D}}\) to \(L^2_{\ell ,U}\) is \(L_\Sigma \). Since \({\mathcal {L}}\) is unitary, we have \(\Vert L_\Sigma \Vert =\Vert M_{{\widehat{{\mathfrak {D}}}}}\Vert =\Vert {\widehat{{\mathfrak {D}}}}\Vert _\infty \). This proves item (1).

By (3.9) and item (1), the operator \({\mathfrak {H}}_\Sigma \) coincides with \(\pi _+{\mathfrak {D}}\pi _-\) on \(L^{2-}_{\ell ,U}\), and hence \({\mathfrak {H}}_\Sigma \) is the unique extension to an operator in \({{\mathcal {B}}}(L^{2-}_U,L^{2+}_Y)\) of \(\pi _+{\mathfrak {D}}\pi _-\) restricted to \(L^{2-}_{\ell ,U}\). Observing that \(\pi _+\) is a contraction on \(L^2_K\), we obtain that \(\Vert {\mathfrak {H}}_\Sigma \Vert \leqslant \Vert L_\Sigma \Vert =\Vert {\widehat{{\mathfrak {D}}}}\Vert _\infty \). To see that the factorization of \({\mathfrak {H}}_\Sigma |_{L^{2-}_{\ell ,U}}\) in (3.9) holds, let \({{\mathbf {u}}}\in L^{2-}_{\ell ,U}\) and note that condition (4)(c) in Definition 2.1 gives that \({\mathfrak {C}}{\mathfrak {B}}{{\mathbf {u}}}=\pi _+{\mathfrak {D}}\pi _- {{\mathbf {u}}}= {\mathfrak {H}}_\Sigma {{\mathbf {u}}}\), which is in \(L^{2+}_Y\) by the boundedness of \({\mathfrak {H}}_\Sigma \). Hence \({\mathfrak {B}}{{\mathbf {u}}}\in {\text {dom}}({\mathbf {W}}_o)\) and \({\mathfrak {H}}_\Sigma {{\mathbf {u}}}= {\mathbf {W}}_o {\mathfrak {B}}{{\mathbf {u}}}\). This establishes item (2).

That \({\widehat{{\mathfrak {D}}}}^d\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(Y,U))\) follows directly from \({\widehat{{\mathfrak {D}}}} ^d(\lambda )={\widehat{{\mathfrak {D}}}}({\overline{\lambda }})^*\) in Theorem 2.4. By item (1) of the present theorem, which has already been proved, applied to \(\Sigma ^d\), the restriction of \({\mathfrak {D}}^d\) to \(L_{\ell ,Y}^2\) has a unique extension to \(L_{\Sigma ^d}\in {{\mathcal {B}}}(L^{2}_Y,L^{2}_U)\). Moreover, \(L_{\Sigma ^d}=R\,L_\Sigma ^*\,R\), because for all \({{\mathbf {u}}}\in L^{2}_{\ell ,U}\), \({{\mathbf {y}}}\in L^{2}_{r,Y}\) and some \(\omega >\max \left\{ 0,\omega _{\mathfrak {A}} \right\} \),

$$\begin{aligned} \left\langle L_\Sigma ^* {{\mathbf {y}}} , {{\mathbf {u}}} \right\rangle _{L^{2}_U}&= \left\langle {{\mathbf {y}}} , {\mathfrak {D}}{{\mathbf {u}}} \right\rangle _{L^{2}_Y} = \left\langle {{\mathbf {y}}} , {\widetilde{{\mathfrak {D}}}} {{\mathbf {u}}} \right\rangle _{L^{2}_{-\omega ,Y},L^{2}_{\omega ,Y}}\\&= \left\langle {\widetilde{{\mathfrak {D}}}}^*{{\mathbf {y}}} , {{\mathbf {u}}} \right\rangle _{L^{2}_{-\omega ,U},L^{2}_{\omega ,U}} = \left\langle {\mathfrak {D}}^\circledast {{\mathbf {y}}} , {{\mathbf {u}}} \right\rangle _{L^{2}_U}, \end{aligned}$$

so that \(L_\Sigma ^*\) and \({\mathfrak {D}}^\circledast \) coincide on \(L^{2}_{r,Y}\) by the density of \(L^2_{\ell ,U}\) in \(L^2_U\); then also \(R\,L_\Sigma ^*R\) and \(R\,{\mathfrak {D}}^\circledast R={\mathfrak {D}}^d\) coincide on \(L^2_{\ell ,Y}\), so that (3.10) holds. Letting \(\iota _\pm :L^{2\pm }_K\rightarrow L^2_K\) denote the injection, we can write \({\mathfrak {H}}_{\Sigma }=\pi _+L_{\Sigma }\iota _-\), and then \({\mathfrak {H}}_{\Sigma }^*=\pi _-L_{\Sigma }^*\iota _+\), so that

$$\begin{aligned} {\mathfrak {H}}_{\Sigma ^d}=\pi _+L_{\Sigma ^d}\iota _- =\pi _+R\,L_\Sigma ^*R\,\iota _- =R\,\pi _-L_\Sigma ^*\iota _+R =R\,{\mathfrak {H}}_\Sigma ^*\,R. \end{aligned}$$
(3.14)

Now (3.11) follows from (3.14) and (3.9), using the first identity in (3.5), and hence item (3) is true.

Now assume that \({\text {dom}}({\mathbf {W}}_c ^\bigstar )\) is dense in X, hence \({\mathbf {W}}_c\), the adjoint of \({\mathbf {W}}_c ^\bigstar \), is closed and densely defined. From item (3) in Proposition 3.2 and (3.9), it follows that \({\mathfrak {H}}_\Sigma \) and \({\mathbf {W}}_o {\mathbf {W}}_c\) coincide on \(L^{2-}_{\ell ,U}\). We now show that \({\text {ran}}( {\mathbf {W}}_c)\subset {\text {dom}}({\mathbf {W}}_o)\) and that \({\mathfrak {H}}_\Sigma \) and \({\mathbf {W}}_o {\mathbf {W}}_c\) also coincide on \({\text {dom}}({\mathbf {W}}_c)\). Let \({{\mathbf {u}}}\in {\text {dom}}({\mathbf {W}}_c)\subset L^{2-}_U\) and \(x_c= {\mathbf {W}}_c {{\mathbf {u}}}\in {\text {ran}}({\mathbf {W}}_c)\). Choose \(T>0\) and \({{\mathbf {y}}}\in L^{2}([0,T];Y)\) arbitrarily. Then Lemma 2.5 and item (3) yield

$$\begin{aligned} ({\mathfrak {C}}^T)^*{{\mathbf {y}}}=({\mathfrak {B}}^d)^T(\Lambda ^T_Y)^*{{\mathbf {y}}}\in {\text {dom}}({\mathbf {W}}_c^\bigstar ), \end{aligned}$$

while item (2) of Proposition 3.2 and the boundedness of \({\mathfrak {C}}^T\) give

$$\begin{aligned} \left\langle {{\mathbf {y}}} , {\mathfrak {C}}^T x_c \right\rangle _{L^{2+}_Y}&=\lim _{t\rightarrow \infty } \left\langle ({\mathfrak {C}}^T)^* {{\mathbf {y}}} , {\mathfrak {B}}\pi _{[-t,0]}{{\mathbf {u}}} \right\rangle _X =\lim _{t\rightarrow \infty } \left\langle {{\mathbf {y}}} , \pi _{[0,T]}{\mathfrak {C}}{\mathfrak {B}}\pi _{[-t,0]}{{\mathbf {u}}} \right\rangle _{L^{2}([0,T];Y)}\\&=\lim _{t\rightarrow \infty } \left\langle {{\mathbf {y}}} , \pi _{[0,T]} {\mathfrak {H}}_\Sigma \pi _{[-t,0]}{{\mathbf {u}}} \right\rangle _{L^{2}([0,T];Y)} =\left\langle {{\mathbf {y}}} , \pi _{[0,T]} {\mathfrak {H}}_\Sigma {{\mathbf {u}}} \right\rangle _{L^{2}([0,T];Y)}, \end{aligned}$$

using the boundedness of \({\mathfrak {H}}_\Sigma \) in the last identity. Since the above computation holds for all \({{\mathbf {y}}}\) and all T, we have \(\pi _{[0,T]}{\mathfrak {C}}x_c={\mathfrak {C}}^T x_c=\pi _{[0,T]} {\mathfrak {H}}_\Sigma {{\mathbf {u}}}\) for all \(T>0\). This shows that \({\mathfrak {C}}x_c = {\mathfrak {H}}_\Sigma {{\mathbf {u}}}\in L^{2+}_Y\). In particular, we have \(x_c\in {\text {dom}}({\mathbf {W}}_o)\) and \({\mathbf {W}}_o {\mathbf {W}}_c {{\mathbf {u}}}= {\mathbf {W}}_o x_c= {\mathfrak {C}}x_c={\mathfrak {H}}_\Sigma {{\mathbf {u}}}\). Equality (3.13) is obtained by applying (3.12) to \(\Sigma ^d\), using that \({\mathfrak {H}}_{\Sigma ^d}=R\,{\mathfrak {H}}_\Sigma ^*R\), as proved above in (3.14), and the identities in (3.5). \(\square \)

Corollary 3.5

Let \(\Sigma \) be a well-posed system with \({\widehat{{\mathfrak {D}}}}\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\). If \(\Sigma \) is controllable, then \({\mathbf {W}}_o\) is densely defined; if \(\Sigma \) is observable, then \({\mathbf {W}}_c^\bigstar \) is densely defined.

Proof

By Theorem 3.4, the finite-time reachable subspace \(\text{ Rea }\,(\Sigma )={\text {ran}}({\mathfrak {B}})\) is contained in \({\text {dom}}({\mathbf {W}}_o)\) and the finite-time observable subspace \(\text{ Obs }\,(\Sigma )={\text {ran}}({\mathfrak {B}}^d)\) is contained in \({\text {dom}}({\mathbf {W}}_c^\bigstar )\). Thus the claim follows directly from Definition 2.6. \(\square \)

We now present two cases where the \(L^2\)-input and \(L^2\)-output maps are both bounded.

Lemma 3.6

For a well-posed system \(\Sigma \), the following hold:

  1. (1)

    If \(\Sigma \) is exponentially stable, then \({\mathbf {W}}_c\in {{\mathcal {B}}}(L^{2-}_U,X)\) and \({\mathbf {W}}_o\in {{\mathcal {B}}}(X,L^{2+}_Y)\).

  2. (2)

    If \(\Sigma \) is passive, then \({\mathbf {W}}_c\) and \({\mathbf {W}}_o\) are everywhere-defined contractions.

Proof

Concerning item (1), if \(\Sigma \) is exponentially stable, then \(\omega _{\mathfrak {A}}<0\) so that we can choose \(\omega =0\) in order to obtain from (2.7) that \({\widetilde{{\mathfrak {C}}}}\in {{\mathcal {B}}}(X,L^{2+}_Y)\) and \({\widetilde{{\mathfrak {B}}}}\in {{\mathcal {B}}}(L^{2-}_U,X)\). Then \({\mathbf {W}}_o ={\widetilde{{\mathfrak {C}}}}\) and \({\mathbf {W}}_c ^\bigstar ={\widetilde{{\mathfrak {B}}}}^*\) are bounded, too, and we have \({\mathbf {W}}_c=({\mathbf {W}}_c^\bigstar )^*\in {{\mathcal {B}}}(L^{2-}_U,X)\).

For item (2), note that a passive system satisfies (1.5) with \(S(x)=\Vert x\Vert ^2_X\) by definition. For trajectories \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) on \({{{\mathbb {R}}}^{+}}\) with \({{\mathbf {u}}}=0\), we in particular obtain \(\int _0^t\Vert {{\mathbf {y}}}(s)\Vert ^2\,{\mathrm {d}}s\leqslant \Vert {{\mathbf {x}}}(0)\Vert ^2\), and letting \(t\rightarrow \infty \), we get \({{\mathbf {y}}}\in L^{2+}_Y\). Moreover, by (2.3) and the definition (3.1) of \({\mathbf {W}}_o\) we have \(\Vert {{\mathbf {y}}}\Vert _{L^{2+}_Y}^2=\Vert {\mathbf {W}}_o{{\mathbf {x}}}(0)\Vert _{L^{2+}_Y}^2\leqslant \Vert {{\mathbf {x}}}(0)\Vert _X^2\). This proves that \({\mathbf {W}}_o\) is an everywhere-defined contraction, and applying the same argument to the passive dual \(\Sigma ^d\), using (3.5), gives that \({\mathbf {W}}_c^\bigstar \) is a contraction, hence \({\mathbf {W}}_c\) is a well-defined contraction, too. \(\square \)

The following definition presents the analogues of exact \(\ell ^2\)-controllability and exact \(\ell ^2\)-observability from [9] in the context of well-posed systems.

Definition 3.7

The well-posed system \(\Sigma \) is (exactly) \(L^2\)-controllable if \({\mathbf {W}}_c ^\bigstar \) is densely defined and \({\text {ran}}({\mathbf {W}}_c )=X\). The system \(\Sigma \) is (exactly) \(L^2\)-observable if \({\mathbf {W}}_o \) is densely defined and \({\text {ran}}({\mathbf {W}}_o ^*)=X\). The system \(\Sigma \) is (exactly) \(L^2\)-minimal if it is both \(L^2\)-controllable and \(L^2\)-observable.

By (3.5), \(\Sigma \) is \(L^2\)-controllable (\(L^2\)-observable) if and only if \(\Sigma ^d\) is \(L^2\)-observable (\(L^2\)-controllable). Some differences between \(\ell ^2\)-controllability/observability and approximate controllability/observability for discrete-time systems are described in [9,  Proposition 2.7]; here we prove analogous results in the present context, and we also provide new information on these relationships.
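We note in passing (this observation is ours and is not needed in the sequel) that for an exponentially stable system with finite-dimensional state space X, the approximate and exact notions coincide: by Lemma 3.6, \({\mathbf {W}}_c\) and \({\mathbf {W}}_o\) are then bounded and everywhere defined, \({\text {ran}}({\mathbf {W}}_c)\supset {\text {ran}}({\mathfrak {B}})\) and \({\text {ran}}({\mathbf {W}}_o^*)\supset \text{ Obs }\,(\Sigma )\), and a dense subspace of a finite-dimensional space is the whole space. Hence, in that case, controllability implies \(L^2\)-controllability and observability implies \(L^2\)-observability, and the distinction between the approximate and exact notions is genuinely infinite-dimensional.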

Corollary 3.8

For each well-posed system \(\Sigma \) as in Definition 2.1, \(L^2\)-controllability (\(L^2\)-observability) implies (approximate) controllability (observability). In particular, \(L^2\)-minimality of \(\Sigma \) implies minimality of \(\Sigma \). When we additionally assume that \({\widehat{{\mathfrak {D}}}}\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\), the following statements are true:

  1. (1)

    If \(\Sigma \) is \(L^2\)-controllable then \({\mathbf {W}}_o\) is bounded.

  2. (2)

    If \(\Sigma \) is \(L^2\)-observable then \({\mathbf {W}}_c\) is bounded.

  3. (3)

    If \(\Sigma \) is \(L^2\)-minimal then \({\mathbf {W}}_c^* \) and \({\mathbf {W}}_o\) are both bounded and bounded below.

Hence, the assumptions on denseness of the domains of \({\mathbf {W}}_c ^\bigstar \) and \({\mathbf {W}}_o\) impose no restriction in the study of the bounded real lemma, since in the standard version (Theorem 1.9) we assume minimality (or even \(L^2\)-minimality in Theorem 1.10) and in the strict version (Theorem 1.12) we assume exponential stability; see Lemma 3.6.

Proof of Corollary 3.8

Assume that \(\Sigma \) is \(L^2\)-observable; then by Definition 3.7, \({\text {dom}}({\mathbf {W}}_o)\) is dense in X and \({\text {ran}}({\mathbf {W}}_o^*)=X\). Since \({\mathbf {W}}_o\) is closed, the comment after (3.1) gives that \(\Sigma \) is (approximately) observable. If instead \(\Sigma \) is \(L^2\)-controllable, then \(\Sigma ^d\) is \(L^2\)-observable, and further \(\Sigma ^d\) is observable by what we just proved; hence \(\Sigma \) is controllable by Corollary 2.7.

Now assume that \(\Sigma \) is \(L^2\)-controllable and that \({\widehat{{\mathfrak {D}}}}\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\). Then \({\text {dom}}({\mathbf {W}}_c^\bigstar )\) is dense by definition, and according to Theorem 3.4, we have \(X={\text {ran}}({\mathbf {W}}_c)\subset {\text {dom}}({\mathbf {W}}_o)\), so that \({\mathbf {W}}_o\) is bounded by the closed graph theorem. This completes the proof of item (1), and the proof of item (2) is easy using duality.

Finally, we prove item (3). By assumption the ranges of \({\mathbf {W}}_c\) and \({\mathbf {W}}_o^*\) are equal to X. From items (1) and (2) we obtain that \({\mathbf {W}}_c\) and \({\mathbf {W}}_o^*\) are bounded. The boundedness of \({\mathbf {W}}_c\) and \({\mathbf {W}}_o^*\) together with \({\text {ran}}({\mathbf {W}}_c)=X={\text {ran}}({\mathbf {W}}_o^*)\) yields that \({\mathbf {W}}_c\) and \({\mathbf {W}}_o^*\) have bounded right inverses, or, equivalently, that \({\mathbf {W}}_c^*\) and \({\mathbf {W}}_o\) have bounded left inverses, and hence the latter are bounded below. \(\square \)

4 System Nodes and Well-posed Linear Systems

The well-posed systems considered in the present paper can alternatively be formulated in a differential representation via a so-called system node \( \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \). In this section we review some of the details of system nodes and describe some related topics relevant for the paper, including a reformulation of the KYP-inequality in terms of system nodes. See Chapters 3 and 4 of [31] for full details and many more results on system nodes.

4.1 Construction of the System Node

Let \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) be a well-posed linear system as in Definitions 2.1 and 2.2. Let A on X be the infinitesimal generator of the \(C_0\)-semigroup \({\mathfrak {A}}^t\), that is,

$$\begin{aligned} {\text {dom}}(A)=\left\{ x\in X \biggm \vert \lim _{h\downarrow 0} \frac{1}{h}({\mathfrak {A}}^hx -x) \text { exists} \right\} ,\quad Ax=\lim _{h\downarrow 0} \frac{1}{h}({\mathfrak {A}}^h x-x). \end{aligned}$$

Now fix the rigging parameter \(\beta \in \rho (A)\) arbitrarily and define the interpolation space \(X_1:={\text {dom}}(A)\) with the Hilbert space norm \(\Vert x\Vert _1:=\Vert (\beta -A)x\Vert _X\); then \(\alpha -A\) is an isomorphism from \(X_1\) to X for all \(\alpha \in \rho (A)\). Next complete X in the norm \(\Vert x\Vert _{-1}:=\Vert (\beta -A)^{-1}x\Vert _X\) to get the extrapolation space \(X_{-1}\). Then we have the chain of inclusions

$$\begin{aligned} X_1\subset X\subset X_{-1} \end{aligned}$$
(4.1)

with dense and continuous embeddings, and the spaces \(X_1\), X and \(X_{-1}\) form a Gelfand triple. Moreover, the generator A extends uniquely to a bounded operator \(A_{-1}\) in \({{\mathcal {B}}}(X, X_{-1})\) which in turn is the infinitesimal generator of a \(C_0\)-semigroup \({\mathfrak {A}}_{-1}^t\) on \(X_{-1}\) which extends \({\mathfrak {A}}^t\). The resolvent set \(\rho (A_{-1})\) equals \(\rho (A)\); see [31,  §3.6] for further details.
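A concrete example of the rigging (4.1), for illustration only: take \(X=\ell ^2\) and \(A={\text {diag}}(-1,-2,-3,\dots )\), which generates the \(C_0\)-semigroup \({\mathfrak {A}}^t={\text {diag}}(e^{-t},e^{-2t},\dots )\), and choose \(\beta =0\in \rho (A)\). Then

$$\begin{aligned} X_1=\Big \{ x \Bigm \vert \sum _{n\geqslant 1} n^2|x_n|^2<\infty \Big \},\qquad X_{-1}=\Big \{ x \Bigm \vert \sum _{n\geqslant 1} n^{-2}|x_n|^2<\infty \Big \}, \end{aligned}$$

and \(A_{-1}={\text {diag}}(-n)_{n\geqslant 1}\) is indeed bounded from X to \(X_{-1}\), since \(\Vert A_{-1}x\Vert _{-1}=\Vert x\Vert _X\).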

By Theorems 4.2.1 and 4.4.2 in [31] there exist bounded operators \(B\in {{\mathcal {B}}}(U,X_{-1})\), the control operator, and \(C\in {{\mathcal {B}}}(X_1,Y)\), the observation operator, that are uniquely determined by the formulas

$$\begin{aligned} {\mathfrak {B}}{{\mathbf {u}}}=\int _{-\infty }^0 {\mathfrak {A}}_{-1}^{-s} B {{\mathbf {u}}}(s) \,{\mathrm {d}}s, \quad {{\mathbf {u}}}\in L^{2-}_{\ell ,U},\qquad ({\mathfrak {C}}x)(t)=C{\mathfrak {A}}^t x,\quad x\in X_1. \end{aligned}$$
(4.2)

Note that while B maps into \(X_{-1}\) and \({\mathfrak {A}}_{-1}^t\) acts on \(X_{-1}\), the result after integration in the first formula still ends up in X.
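A classical example of this phenomenon, sketched here under our standing conventions and not taken verbatim from [31]: let \(X=L^2({{{\mathbb {R}}}^{+}})\), \(U={{\mathbb {C}}}\), let \({\mathfrak {A}}^t\) be the right-shift semigroup, \(({\mathfrak {A}}^tx)(\xi )=x(\xi -t)\) for \(\xi >t\) and 0 otherwise, and let \(B=\delta _0\in X_{-1}\) be the point mass at 0, so that \(B\notin {{\mathcal {B}}}(U,X)\). Since \({\mathfrak {A}}_{-1}^{-s}\delta _0=\delta _{-s}\) for \(s\leqslant 0\), the first formula in (4.2) gives

$$\begin{aligned} ({\mathfrak {B}}{{\mathbf {u}}})(\xi )=\Big (\int _{-\infty }^0 \delta _{-s}\,{{\mathbf {u}}}(s)\,{\mathrm {d}}s\Big )(\xi )={{\mathbf {u}}}(-\xi ),\quad \xi \geqslant 0, \end{aligned}$$

which is a perfectly bounded map from \(L^{2-}_U\) into X: the state at time 0 stores the reflected past input.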

With A and B defined as above we can form a closed operator \( A \& B :\left[ {\begin{matrix}X \\ U \end{matrix}}\right] \supset {\text {dom}}({A \& B}) \rightarrow X\) by

$$ \begin{aligned} {\text {dom}}({A \& B})= \left\{ \begin{bmatrix} x \\ u \end{bmatrix} \biggm \vert A_{-1} x + B u \in X \right\} \quad \text {and}\quad {A \& B}\begin{bmatrix} x \\ u \end{bmatrix} = A_{-1} x + B u. \end{aligned}$$

Choose a fixed \(\alpha \in {{\mathbb {C}}}_{\omega _{\mathfrak {A}}}\). For \( \left[ {\begin{matrix} x \\ u \end{matrix}}\right] \in {\text {dom}}({A \& B})\), we then have

$$\begin{aligned} x - (\alpha - A_{-1})^{-1} B u =&(\alpha - A_{-1})^{-1} \big ( \alpha x - (A_{-1} x + Bu)\big ) \\&\in (\alpha - A)^{-1} X = X_1 = {\text {dom}}(C). \end{aligned}$$

From \({\mathfrak {D}}\), we can compute the transfer function \({\widehat{{\mathfrak {D}}}}\in H^\infty ({{\mathbb {C}}}_\omega ;{{\mathcal {B}}}(U,Y))\), \(\omega >\omega _{\mathfrak {A}}\), of \(\Sigma \) via Proposition 2.3. Since \(\alpha \in {{\mathbb {C}}}_{\omega _{\mathfrak {A}}}\), we can evaluate \({\widehat{{\mathfrak {D}}}}(\alpha )\), and then define

$$ \begin{aligned} C \& D :\begin{bmatrix} x \\ u \end{bmatrix} \mapsto C \big (x - (\alpha - A_{-1})^{-1} B u\big )+ {\widehat{{\mathfrak {D}}}}(\alpha )u. \end{aligned}$$
(4.3)
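The definition (4.3) does not depend on the choice of \(\alpha \): for \(\alpha ,\beta \in {{\mathbb {C}}}_{\omega _{\mathfrak {A}}}\), the resolvent identity together with the difference formula for the transfer function of a well-posed system, \({\widehat{{\mathfrak {D}}}}(\alpha )-{\widehat{{\mathfrak {D}}}}(\beta )=C\big ((\alpha -A_{-1})^{-1}-(\beta -A_{-1})^{-1}\big )B\) (cf. [31,  Lemma 4.7.5]), gives

$$\begin{aligned} C\big (x-(\alpha -A_{-1})^{-1}Bu\big )+{\widehat{{\mathfrak {D}}}}(\alpha )u = C\big (x-(\beta -A_{-1})^{-1}Bu\big )+{\widehat{{\mathfrak {D}}}}(\beta )u, \end{aligned}$$

since the two differences cancel each other.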

Note that if \(x \in X_1\), then \( \left[ {\begin{matrix} x \\ 0 \end{matrix}}\right] \in {\text {dom}}({A \& B})\) and \( {C \& D}\left[ {\begin{matrix} x \\ 0 \end{matrix}}\right] = C x\). In general there is no sensible way to separate out an independent feedthrough operator \(D \in {{\mathcal {B}}}(U,Y)\) except in some special cases, e.g., if at least one of \(B :U \rightarrow X\) and \(C :X \rightarrow Y\) is bounded (see Theorems 4.5.2 and 4.5.10 in [31]), or if \(\Sigma \) is regular (see Chapter 5 in [31]). Rather we think of \( C \& D\) as an extension of the operator C defined on \(X_1 \cong \left[ {\begin{matrix} X_1 \\ 0 \end{matrix}}\right] \subset \left[ {\begin{matrix} X \\ U \end{matrix}}\right] \) to the operator \( C \& D\) defined on \( {\text {dom}}({A \& B})\supset \left[ {\begin{matrix} X_1 \\ 0 \end{matrix}}\right] \) and mapping into Y. After the above steps, we can introduce the system node \( \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] :\left[ {\begin{matrix} X \\ U \end{matrix}}\right] \supset {\text {dom}}(\left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] ) \rightarrow \left[ {\begin{matrix} X \\ Y \end{matrix}}\right] \) with

$$ \begin{aligned} {\text {dom}}(\left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] ) = {\text {dom}}({A \& B}) = {\text {dom}}({C \& D}) \end{aligned}$$

and action

$$ \begin{aligned} \begin{bmatrix} {A \& B}\\ {C \& D}\end{bmatrix} :\begin{bmatrix} x \\ u \end{bmatrix} \mapsto \begin{bmatrix} {A \& B}\left[ {\begin{matrix}x \\ u \end{matrix}}\right] \\ {C \& D}\left[ {\begin{matrix} x \\ u \end{matrix}}\right] \end{bmatrix}. \end{aligned}$$

We next recall Definition 4.7.2 in [31].

Definition 4.1

Suppose that \( {{\mathbf {S}}}:=\left[ {\begin{matrix}{A \& B}\\ {C \& D} \end{matrix}}\right] \) is an operator mapping a dense subspace \({\text {dom}}({{\mathbf {S}}})\) of \(\left[ {\begin{matrix} X \\ U \end{matrix}}\right] \) into \(\left[ {\begin{matrix} X \\ Y \end{matrix}}\right] \). We shall say that \({{\mathbf {S}}}\) is a system node if it has the following properties:

  1. (1)

    \({{\mathbf {S}}}\) is closed as an operator from \(\left[ {\begin{matrix} X \\ U \end{matrix}}\right] \) into \(\left[ {\begin{matrix} X \\ Y \end{matrix}}\right] \).

  2. (2)

    The operator \(A :X \supset {\text {dom}}(A) \rightarrow X\) defined by \( A x = A \& B \left[ {\begin{matrix} x \\ 0 \end{matrix}}\right] \) on \({\text {dom}}(A) = \{ x \in X \bigm \vert \left[ {\begin{matrix} x \\ 0 \end{matrix}}\right] \in {\text {dom}}({{\mathbf {S}}}) \}\) has domain dense in X, and A as an unbounded operator on X generates a \(C_0\)-semigroup on X.

  3. (3)

    The operator \( A \& B\) (with \( {\text {dom}}({A \& B}) = {\text {dom}}({{\mathbf {S}}})\)) can be extended to an operator

    $$\begin{aligned} \begin{bmatrix} A_{-1}&B \end{bmatrix} \in {{\mathcal {B}}}(\left[ {\begin{matrix} X \\ U \end{matrix}}\right] , X_{-1}) \end{aligned}$$

    (where \(X_{-1}\) is the extrapolation space introduced in (4.1)).

  4. (4)

    \({\text {dom}}({{\mathbf {S}}}) = \big \{ \left[ {\begin{matrix} x \\ u \end{matrix}}\right] \in \left[ {\begin{matrix} X \\ U \end{matrix}}\right] \bigm \vert A_{-1} x + B u \in X\}\).

Given a system node \( {{\mathbf {S}}}= \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \) we may define its transfer function \({\widehat{{\mathfrak {D}}}}_{{\mathbf {S}}}(\lambda )\) by

$$ \begin{aligned} {\widehat{{\mathfrak {D}}}}_{{\mathbf {S}}}(\lambda )u = C \& D \begin{bmatrix} (\lambda - A_{-1})^{-1} B \\ 1_U\end{bmatrix} u,\quad \lambda \in \rho (A). \end{aligned}$$
(4.4)

If \({\widehat{{\mathfrak {D}}}}\) is constructed as in Proposition 2.3, then \({\widehat{{\mathfrak {D}}}}_{{\mathbf {S}}}\) is an extension of \({\widehat{{\mathfrak {D}}}}\) from \({{\mathbb {C}}}_{\omega _{\mathfrak {A}}}\) to all of \(\rho (A)\); see [31,  Lemma 4.7.5(iii)].
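For orientation, when X, U and Y are finite dimensional and \({{\mathbf {S}}}\) is bounded, (4.4) collapses to the classical formula (1.3). The following minimal numerical sketch (our illustration, with arbitrarily chosen matrices) checks this reduction:

```python
import numpy as np

# Finite-dimensional sketch (arbitrary matrices, our choice): for bounded
# A, B, C, D the node acts as A&B [x;u] = Ax + Bu, C&D [x;u] = Cx + Du,
# and (4.4) collapses to D-hat(lambda) = C (lambda - A)^{-1} B + D.
A = np.array([[-1.0, 0.5],
              [0.0, -2.0]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.5]])

lam = 1.0 + 2.0j                                   # any point in rho(A)
u = np.array([[1.0]])
x = np.linalg.solve(lam * np.eye(2) - A, B @ u)    # (lambda - A)^{-1} B u
via_node = C @ x + D @ u                           # C&D applied as in (4.4)
classical = C @ np.linalg.solve(lam * np.eye(2) - A, B) + D
assert np.allclose(via_node, classical @ u)
print(classical)
```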

We end this subsection with a result which shows that a system node acts as the connecting operator of a well-posed system.

Lemma 4.2

(See [31,  Theorem 4.6.11(i)].) Suppose that \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) is a well-posed system with associated system node \( {{\mathbf {S}}}= \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \). Let \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) be a system trajectory over \({{\mathbb {R}}}^+\) with state initial condition \({{\mathbf {x}}}(0) = x_0\) and with \({{\mathbf {u}}}\) continuous with distributional derivative \({\dot{{{\mathbf {u}}}}}\) in \(L^{2+}_{loc, U}\) and such that \(\left[ {\begin{matrix} x_0 \\ {{\mathbf {u}}}(0) \end{matrix}}\right] \in {\text {dom}}({{\mathbf {S}}})\). Then \({{\mathbf {x}}}\) is continuously differentiable with values in X, \(\left[ {\begin{matrix} {{\mathbf {x}}}(t) \\ {{\mathbf {u}}}(t) \end{matrix}}\right] \in {\text {dom}}({{\mathbf {S}}})\) for all \(t>0\), \({{\mathbf {y}}}\) is continuous with distributional derivative \({\dot{{{\mathbf {y}}}}}\) in \(L^{2+}_{loc,Y}\), and

$$\begin{aligned} \begin{bmatrix} {\dot{{{\mathbf {x}}}}}(t) \\ {{\mathbf {y}}}(t) \end{bmatrix} = {{\mathbf {S}}}\begin{bmatrix} {{\mathbf {x}}}(t) \\ {{\mathbf {u}}}(t) \end{bmatrix},\quad t \ge 0. \end{aligned}$$
(4.5)

4.2 Reconstruction of the Well-posed System

With the system node \( {{\mathbf {S}}}=\left[ {\begin{matrix} {A \& B}\\ {C \& D} \end{matrix}}\right] \) constructed from \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) as above, it is possible to recover \({\mathfrak {A}}\), \({\mathfrak {B}}\), \({\mathfrak {C}}\), \({\mathfrak {D}}\) and the transfer function \(\widehat{{\mathfrak {D}}}\) from \( \left[ {\begin{matrix} {A \& B}\\ {C \& D} \end{matrix}}\right] \). We first sketch this construction and only afterwards discuss its rigour.

Clearly \({\mathfrak {A}}^t\) is the \(C_0\)-semigroup generated by A, and \({\mathfrak {B}}\) and \({\mathfrak {C}}\) can be recovered via (4.2), taking for \({\mathfrak {C}}\) the unique continuous extension from \(X_1\) to X mapping into \(L^{2+}_{loc,Y}\). Finally, by [31,  Theorem 4.7.14] and its proof, \({\mathfrak {D}}\) can be recovered as the unique extension to a continuous operator from \(L^2_{\ell ,loc,U}\) to \(L^2_{\ell ,loc,Y}\) of the operator

$$ \begin{aligned} {\mathfrak {D}}{{\mathbf {u}}}= t\mapsto {C \& D}\begin{bmatrix}{\mathfrak {B}}^t{{\mathbf {u}}}\\ {{\mathbf {u}}}(t)\end{bmatrix}, \quad t\in {{\mathbb {R}}}, \end{aligned}$$
(4.6)

defined for \({{\mathbf {u}}}\in H^1_{0,loc}({{\mathbb {R}}};U)\) with support bounded to the left; see (2.8) for the definition of this space.

We have seen that the operator \( \left[ {\begin{matrix}{A \& B}\\ {C \& D} \end{matrix}}\right] \) arising from a well-posed system \(\Sigma \) as described in §4.1 is a system node. However, in general, for a system node to give rise to a well-posed system via the above construction more is needed. We shall follow Definition 10.1.1 in [31] and use the following terminology: given A equal to the generator of a \(C_0\)-semigroup on X and operators \(B \in {{\mathcal {B}}}(U, X_{-1})\) and \(C \in {{\mathcal {B}}}(X_1, Y)\), we say that:

  • B is an \(L^2\)-admissible (here abbreviated to admissible) control operator for A if the operator \({\mathfrak {B}}\) defined as in (4.2) maps \(L^{2-}_{\ell , U}\) into X.

  • C is an \(L^2\)-admissible (here abbreviated to admissible) observation operator for A if the operator \({\mathfrak {C}}\) defined as in (4.2) is continuous as an operator from X to \(L^{2+}_{loc, Y}\).

The following result describes what additional conditions must be imposed on a system node in order to conclude that it induces a well-posed system.

Theorem 4.3

Suppose that \( {{\mathbf {S}}}= \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \) is a system node as defined above. Suppose that the semigroup \(t \mapsto {\mathfrak {A}}^t\) generated by A has growth bound \(\omega _{\mathfrak {A}}\) and let \(\omega \) be any real number satisfying \(\omega > \omega _{\mathfrak {A}}\). Then there is a well-posed system \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) such that \({{\mathbf {S}}}\) is the system node arising from \(\Sigma \) if and only if

  (1) the operator \(B :U \rightarrow X_{-1}\) is admissible for A,

  (2) the operator \(C :X_1 \rightarrow Y\) is admissible for A, and

  (3) the system-node transfer function \({\widehat{{\mathfrak {D}}}}_{{\mathbf {S}}}\) in (4.4) is in \(H^\infty ({{\mathbb {C}}}_\omega ;{{\mathcal {B}}}(U,Y))\).

Explicitly, when conditions (1), (2), (3) are satisfied, the associated well-posed system \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) is given by

  • \(t \mapsto {\mathfrak {A}}^t\) is the \(C_0\)-semigroup generated by A,

  • \({\mathfrak {B}}\) and \({\mathfrak {C}}\) are given by formulas (4.2), and

  • \({\mathfrak {D}}\in {{\mathcal {B}}}(L^2_{\ell , loc, U}, L^2_{\ell , loc, Y})\) is a continuous extension of the operator acting on smooth input functions \({{\mathbf {u}}}\) given by the formula (4.6).

In this case the associated system \(\Sigma \) is \(\omega \)-bounded, i.e., (2.7) holds.

Proof

Assume that \({{\mathbf {S}}}\) satisfies conditions (1), (2) and (3) in the statement of the theorem. Conditions (1) and (2) just say that conditions (i) and (ii) in Theorem 4.7.14 of [31] are met; once we have proved condition (iii) of that theorem, we may conclude that \({{\mathbf {S}}}\) is an \(L^2\)-well-posed system node, which, by Definition 4.7.2 in [31], implies that the constructed system \(\Sigma \) is well-posed. As a consequence of the Paley-Wiener Theorem [31,  Theorem 10.3.4], it follows from Theorem 10.3.5 in [31] that condition (3) is equivalent to \({\widehat{{\mathfrak {D}}}}_{{\mathbf {S}}}\) being the transfer function of an operator \({\mathfrak {D}}\) in \(\text{ TIC}^2_\omega (U,Y)\), that is, a causal, time-invariant operator in \({{\mathcal {B}}}(L^2_{\omega ,U},L^2_{\omega ,Y})\). It then follows from Lemma 2.6.4 in [31] that \({\mathfrak {D}}\) has a unique “extension after restriction” to an operator in \(\text{ TIC}^2_{loc}(U,Y)\), which means it is a continuous, causal, time-invariant operator from \(L^2_{\ell ,loc,U}\) into \(L^2_{\ell ,loc,Y}\), which is precisely what is required for the remaining condition (iii) in Theorem 4.7.14 of [31]. We may thus conclude that \(\Sigma \) constructed from \({{\mathbf {S}}}\) is a well-posed system, which generates the system node \({{\mathbf {S}}}\) in the way described in Sect. 4.1. It then follows from the reverse construction in Sect. 4.2 preceding this theorem that the operator \({\mathfrak {D}}\) is indeed given by (4.6).

That the operators \({\mathfrak {A}}\), \({\mathfrak {B}}\), \({\mathfrak {C}}\) and \({\mathfrak {D}}\) that constitute the well-posed system \(\Sigma \) are \(\omega \)-bounded follows from the discussion in Sect. 2 after Definition 2.2.

Conversely, suppose that \(\Sigma \) constructed from \({{\mathbf {S}}}\) in the theorem is a well-posed system. Then it has \(\omega _{\mathfrak {A}}\) as growth bound, so that \({\mathfrak {A}}\), \({\mathfrak {B}}\), \({\mathfrak {C}}\) and \({\mathfrak {D}}\) are \(\omega \)-bounded, by the above argument. The properties (1)–(3) now follow from Theorem 10.3.6 in [31]. \(\square \)

4.3 Duality Between Admissible Control/Observation Operators for A/\(A^*\)

Here we briefly point out the duality between admissible input pairs (A, B) and admissible output pairs (C, A); see also [31,  Theorem 6.2.13]. Let A be the generator of a \(C_0\)-semigroup \({\mathfrak {A}}\), \(B \in {{\mathcal {B}}}(U, X_{-1})\) and \(C \in {{\mathcal {B}}}(X_{1},Y)\).

Let us define \(A^*\) in the standard way as an unbounded operator on X, and let \(X_1^d\subset X\subset X_{-1}^d\) be the Gelfand triple as in (4.1), but for \(A^*\) and using the parameter \({\overline{\beta }}\in \rho (A^*)\) in place of the operator A and the parameter \(\beta \in \rho (A)\). Next define \(B^* \in {{\mathcal {B}}}(X_1^d, U)\) by identifying U and X with their duals and by viewing \(X_{-1}^d\) as the dual of \(X_1\) via the X-inner product to define the duality pairing:

$$\begin{aligned} \left\langle x , z \right\rangle _{X_1,X_{-1}^d}=\left\langle x , z \right\rangle _X, \qquad x\in X_1,~z\in X. \end{aligned}$$

Define \(C^* \in {{\mathcal {B}}}(Y, X_{-1}^d)\) analogously. When this is done it is a matter of verification to see that the operator \(B^*\) is an admissible observation operator for \(A^*\) if and only if B is an admissible control operator for A. Similarly, if C is an admissible observation operator for A, then \(C^*\) is an admissible control operator for \(A^*\), and vice versa.

Together with the transfer function

$$\begin{aligned} {\widehat{{\mathfrak {D}}}}^\sharp (\lambda ):= {\widehat{{\mathfrak {D}}}}({\overline{\lambda }})^*, \qquad \lambda \in \rho (A^*), \end{aligned}$$

evaluated at some arbitrary \(\alpha \in \rho (A^*)\), the operators \(A^*\), \(C^*\) and \(B^*\) amount to an infinitesimal version of the duality between \(\Sigma \) and \(\Sigma ^d\) described in Theorem 2.4; in fact, the system node for the causal dual \(\Sigma ^d\) is

$$ \begin{aligned} \begin{bmatrix}{A \& B}\\ {C \& D}\end{bmatrix}^*:\begin{bmatrix}X\\ Y\end{bmatrix}\supset {\text {dom}}(\left[ {\begin{matrix}{A \& B}\\ {C \& D} \end{matrix}}\right] ^*)\rightarrow \begin{bmatrix}X\\ U\end{bmatrix}, \end{aligned}$$

in the standard sense of unbounded adjoints.

4.4 KYP-inequalities in terms of System Nodes

In this subsection we show how the standard KYP-inequality (1.13), the strict KYP-inequality (1.16), and the semi-strict KYP-inequality (1.17) can be expressed in terms of the system node \( {{\mathbf {S}}}= \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \) rather than in terms of the well-posed system \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \), at least for the case where H is bounded and strictly positive-definite. The main tool will be Lemma 4.2.

Theorem 4.4

Suppose that \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) is a well-posed system with corresponding system node \( {{\mathbf {S}}}= \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \). Then the \(\Sigma \)-KYP inequalities (1.13), (1.16) and (1.17) correspond to \({{\mathbf {S}}}\)-KYP inequalities as follows.

  (1) A bounded selfadjoint operator H on X solves the standard KYP inequality (1.13) if and only if H satisfies the standard \({{\mathbf {S}}}\)-KYP inequality:

    $$ \begin{aligned} 2 {\text {Re}}\,\langle H (A \& B) \left[ {\begin{matrix} x \\ u \end{matrix}}\right] , x \rangle + \Vert (C \& D) \left[ {\begin{matrix} x \\ u \end{matrix}}\right] \Vert ^2 \le \Vert u \Vert ^2,\quad \left[ {\begin{matrix} x \\ u \end{matrix}}\right] \in {\text {dom}}({{\mathbf {S}}}). \end{aligned}$$
    (4.7)
  (2) A bounded selfadjoint operator H on X satisfies the strict KYP inequality (1.16) if and only if H satisfies the strict \({{\mathbf {S}}}\)-KYP inequality:

    $$ \begin{aligned} 2 {\text {Re}}\,\langle H ( A \& B ) \left[ {\begin{matrix} x \\ u \end{matrix}}\right] , x \rangle + \Vert C \& D \left[ {\begin{matrix} x \\ u \end{matrix}}\right] \Vert ^2 + \delta \Vert x \Vert ^2 \le \langle Hx, x \rangle + (1 - \delta ) \Vert u \Vert ^2 \end{aligned}$$
    (4.8)

    for all \(\left[ {\begin{matrix} x \\ u \end{matrix}}\right] \in {\text {dom}}({{\mathbf {S}}})\).

  (3) A bounded selfadjoint operator H on X satisfies the semi-strict KYP inequality (1.17) if and only if H satisfies the semi-strict \({{\mathbf {S}}}\)-KYP inequality:

    $$ \begin{aligned} 2 {\text {Re}}\,\langle H ( A \& B ) \left[ {\begin{matrix} x \\ u \end{matrix}}\right] , x \rangle + \Vert C \& D \left[ {\begin{matrix} x \\ u \end{matrix}}\right] \Vert ^2 \le \langle Hx, x \rangle + (1 - \delta ) \Vert u \Vert ^2 \end{aligned}$$

    for all \(\left[ {\begin{matrix} x \\ u \end{matrix}}\right] \in {\text {dom}}({{\mathbf {S}}})\).
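Before turning to the proof, we note that in the finite-dimensional bounded case, where \({\text {dom}}({{\mathbf {S}}})=\left[ {\begin{matrix} X \\ U \end{matrix}}\right] \), inequality (4.7) is precisely negative semidefiniteness of the classical KYP matrix. The following sketch (our illustration; the system and the candidate H are arbitrary choices) checks this equivalence numerically:

```python
import numpy as np

# Finite-dimensional sketch of (4.7): with bounded A, B, C, D and
# dom(S) = X x U, the inequality
#   2 Re <H(Ax + Bu), x> + ||Cx + Du||^2 <= ||u||^2   for all (x, u)
# holds iff the classical KYP matrix
#   M = [[A*H + HA + C*C, HB + C*D], [B*H + D*C, D*D - I]]
# is negative semidefinite.
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[0.5]]);  D = np.array([[0.0]])
H = np.array([[0.5]])   # bounded, selfadjoint candidate solution

M = np.block([
    [A.T @ H + H @ A + C.T @ C, H @ B + C.T @ D],
    [B.T @ H + D.T @ C,         D.T @ D - np.eye(1)],
])
assert np.linalg.eigvalsh(M).max() <= 0

# Cross-check (4.7) directly on random vectors [x; u].
rng = np.random.default_rng(0)
for _ in range(1000):
    x, u = rng.normal(size=(1, 1)), rng.normal(size=(1, 1))
    lhs = 2 * ((H @ (A @ x + B @ u)).T @ x).item() + np.linalg.norm(C @ x + D @ u) ** 2
    assert lhs <= np.linalg.norm(u) ** 2 + 1e-12
```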

Proof of statement (1)

Suppose first that H is a bounded selfadjoint operator satisfying the standard KYP inequality (1.13). Let us apply (1.13) to the case where \(x = {{\mathbf {x}}}(0)\) and \({{\mathbf {u}}}\) is equal to the input signal for a smooth trajectory \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) in the sense of Lemma 4.2. Recalling the definition of the action of \(\left[ {\begin{matrix} {\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t \end{matrix}}\right] \), we see that

$$\begin{aligned} \Vert H^{\frac{1}{2}} {{\mathbf {x}}}(t) \Vert ^2 + \int _0^t \Vert {{\mathbf {y}}}(s) \Vert ^2 \,{\mathrm {d}}s \le \Vert H^{\frac{1}{2}} {{\mathbf {x}}}(0) \Vert ^2 + \int _0^t \Vert {{\mathbf {u}}}(s) \Vert ^2 \,{\mathrm {d}}s \end{aligned}$$
(4.9)

for all \(t \ge 0\). As \({{\mathbf {x}}}\) is continuously differentiable and \({{\mathbf {u}}}\) and \({{\mathbf {y}}}\) are continuous, we may move \(\Vert H^{\frac{1}{2}}{{\mathbf {x}}}(0)\Vert ^2\) over to the left-hand side in (4.9), divide by t, let \(t\rightarrow 0\), and finally observe that

$$\begin{aligned} \frac{\,{\mathrm {d}}}{\,{\mathrm {d}}s} \langle H {{\mathbf {x}}}(s), {{\mathbf {x}}}(s) \rangle = 2 {\text {Re}}\,\langle H {\dot{{{\mathbf {x}}}}}(s), {{\mathbf {x}}}(s) \rangle \,, \end{aligned}$$
(4.10)

in order to arrive at

$$\begin{aligned} 2 {\text {Re}}\,\langle H {\dot{{{\mathbf {x}}}}}(0), {{\mathbf {x}}}(0) \rangle + \Vert {{\mathbf {y}}}(0) \Vert ^2 \le \Vert {{\mathbf {u}}}(0) \Vert ^2. \end{aligned}$$

Plugging in the differential system equations (4.5) then leads to

$$ \begin{aligned} 2 {\text {Re}}\,\left\langle H ( A \& B) \begin{bmatrix} x_0 \\ {{\mathbf {u}}}(0)\end{bmatrix}, x_0 \right\rangle + \left\| C \& D \begin{bmatrix} x_0 \\ {{\mathbf {u}}}(0)\end{bmatrix}\right\| ^2 \le \Vert {{\mathbf {u}}}(0) \Vert ^2\,, \end{aligned}$$

where \(\left[ {\begin{matrix} x_0 \\ {{\mathbf {u}}}(0) \end{matrix}}\right] \) can be an arbitrary element of \({\text {dom}}({{\mathbf {S}}})\), thereby arriving at (4.7) as wanted.

Conversely, if H satisfies (4.7), we evaluate (4.7) at \(\left[ {\begin{matrix} x \\ u \end{matrix}}\right] = \left[ {\begin{matrix} {{\mathbf {x}}}(s) \\ {{\mathbf {u}}}(s) \end{matrix}}\right] \) taken from a smooth system trajectory \(({{\mathbf {u}}}(t), {{\mathbf {x}}}(t), {{\mathbf {y}}}(t))\) as in Lemma 4.2 to get

$$ \begin{aligned} 2 {\text {Re}}\,\left\langle H (A \& B) \begin{bmatrix} {{\mathbf {x}}}(s) \\ {{\mathbf {u}}}(s) \end{bmatrix}, {{\mathbf {x}}}(s) \right\rangle + \left\| C \& D \begin{bmatrix} {{\mathbf {x}}}(s) \\ {{\mathbf {u}}}(s) \end{bmatrix} \right\| ^2 \le \Vert {{\mathbf {u}}}(s) \Vert ^2. \end{aligned}$$

Due to the differential system equations (4.5) we can rewrite this last expression as

$$\begin{aligned} 2 {\text {Re}}\,\langle H {\dot{{{\mathbf {x}}}}}(s), {{\mathbf {x}}}(s) \rangle + \Vert {{\mathbf {y}}}(s) \Vert ^2 \le \Vert {{\mathbf {u}}}(s)\Vert ^2 \end{aligned}$$
(4.11)

for all \(s \ge 0\). Again using (4.10), we can integrate (4.11) from \(s=0\) to \(s=t\) to arrive at

$$\begin{aligned} \langle H {{\mathbf {x}}}(t), {{\mathbf {x}}}(t) \rangle - \langle H {{\mathbf {x}}}(0), {{\mathbf {x}}}(0) \rangle + \int _0^t \Vert {{\mathbf {y}}}(s) \Vert ^2 \,{\mathrm {d}}s \le \int _0^t \Vert {{\mathbf {u}}}(s) \Vert ^2 \,{\mathrm {d}}s \end{aligned}$$

which we can interpret as saying that

$$\begin{aligned} \bigg \Vert \begin{bmatrix} H^{\frac{1}{2}} &{} 0 \\ 0 &{} I \end{bmatrix} \begin{bmatrix} {\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t \end{bmatrix} \begin{bmatrix} x_0 \\ {{\mathbf {u}}}\end{bmatrix} \bigg \Vert \le \bigg \Vert \begin{bmatrix} H^{\frac{1}{2}} &{} 0 \\ 0 &{} I \end{bmatrix} \begin{bmatrix} x_0 \\ {{\mathbf {u}}}\end{bmatrix} \bigg \Vert , \end{aligned}$$

i.e., the KYP-inequality (1.13) holds for all \(\left[ {\begin{matrix} x_0 \\ {{\mathbf {u}}} \end{matrix}}\right] \in \left[ {\begin{matrix} X \\ L^2([0,t], U) \end{matrix}}\right] \) such that \({{\mathbf {u}}}\) is sufficiently smooth (in the sense used in Lemma 4.2) and \(\left[ {\begin{matrix} x_0 \\ {{\mathbf {u}}}(0) \end{matrix}}\right] \in {\text {dom}}({{\mathbf {S}}})\). Noting that the collection of all such \(\left[ {\begin{matrix}x_0\\ {{\mathbf {u}}} \end{matrix}}\right] \) is dense in \( \left[ {\begin{matrix}X\\ L^2([0,t], U) \end{matrix}}\right] \), we see that (1.13) continues to hold on the space \(\left[ {\begin{matrix} X \\ L^2([0,t], U) \end{matrix}}\right] \) as wanted.

Proof of (2) and (3): The proofs of statements (2) and (3) follow in much the same way as that for (1). For the case of statement (2), if we assume that H satisfies the strict KYP inequality (1.16), apply the associated quadratic form to a vector of the form \(\left[ {\begin{matrix} {{\mathbf {x}}}(0) \\ {{\mathbf {u}}} \end{matrix}}\right] \) coming from a smooth system trajectory \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\), and also take into account the interpretation (1.15) for the operator \(\begin{bmatrix} {\mathfrak {C}}^t_{1_X,A}&{\mathfrak {D}}^t_{A,B} \end{bmatrix}\), then we can interpret (1.16) as saying that

$$\begin{aligned} \bigg \Vert \begin{bmatrix} H^{\frac{1}{2}} &{} 0 \\ 0 &{} I \end{bmatrix} \begin{bmatrix} {{\mathbf {x}}}(t) \\ {{\mathbf {y}}}(t) \end{bmatrix} \bigg \Vert ^2 + \delta \int _0^t \Vert {{\mathbf {x}}}(s) \Vert ^2 \,{\mathrm {d}}s \le \Vert H^{\frac{1}{2}} {{\mathbf {x}}}(0) \Vert ^2 + (1 - \delta ) \int _0^t \Vert {{\mathbf {u}}}(s) \Vert ^2 \,{\mathrm {d}}s. \end{aligned}$$

The above argument for statement (1) then leads us to the conclusion that the differential form (4.8) is equivalent to the integrated form (1.16).

Statement (3) follows in much the same way. One repeats the argument used for statement (2) while ignoring the term

$$\begin{aligned} \delta \begin{bmatrix} ({\mathfrak {C}}^t_{1_X, A})^* \\ ( {\mathfrak {D}}^t_{A,B})^* \end{bmatrix} \begin{bmatrix} {\mathfrak {C}}^t_{1_X, A}&{\mathfrak {D}}^t_{A,B} \end{bmatrix} \end{aligned}$$

in (1.16) and the term \(\delta \Vert x \Vert ^2\) in (4.8). \(\square \)

Remark 4.5

Arov and Staffans [5] have worked out a generalized KYP-inequality for the infinite dimensional, continuous-time setting, with solution H possibly unbounded, formulated directly in terms of the system node \( {{\mathbf {S}}}= \left[ {\begin{matrix} A \& B \\ C \& D \end{matrix}}\right] \) (see Definition 5.6 and Theorem 5.7 there), to characterize when the transfer function of \({{\mathbf {S}}}\) is in the Schur class. It suffices to say here that the definition of a solution there involves several auxiliary conditions in addition to the actual spatial operator inequality, all of which collapse to the inequality (4.7) in case H is bounded.

5 Examples of Systems with \(L^2\)-minimality

In this section we consider a few concrete cases where the system \(\Sigma \) is \(L^2\)-minimal. In the first case we assume that the \(C_0\)-semigroup \({\mathfrak {A}}\) can be embedded into a \(C_0\)-group. We shall first recall some facts about \(C_0\)-groups; for further details we refer to [13,  §II.3] and §6.2 in [19]. By a \(C_0\)-group we mean a family of linear operators \(\{{\mathfrak {A}}^t \mid t \in {{\mathbb {R}}} \}\) on X such that

$$\begin{aligned} {\mathfrak {A}}^0=1_X,\qquad {\mathfrak {A}}^t {\mathfrak {A}}^s = {\mathfrak {A}}^{t+s} \text { for all } t, \, s \in {{\mathbb {R}}} \end{aligned}$$

and which is strongly continuous at 0:

$$\begin{aligned} \lim _{t \rightarrow 0} {\mathfrak {A}}^t x = x \text { for all } x \in X, \end{aligned}$$

where the limit is now taken from both sides and not just from the right as in the semigroup case. The generator of the \(C_0\)-group \(\{ {\mathfrak {A}}^t \mid t \in {{\mathbb {R}}} \}\) is defined to be the operator A with domain \({\text {dom}}(A)\) given by

$$\begin{aligned} {\text {dom}}(A) = \left\{ x \in X \biggm \vert \lim _{t \rightarrow 0} \frac{1}{t} ({\mathfrak {A}}^t x - x) \text { exists in } X \right\} , \end{aligned}$$

again with a two-sided limit, and with action then given by

$$\begin{aligned} A x = \lim _{t \rightarrow 0} \frac{1}{t} ({\mathfrak {A}}^t x - x),\quad x \in {\text {dom}}(A). \end{aligned}$$

Among the various characterizations contained in the generation theorem for groups [13,  p. 79] is the following: an operator A generates a \(C_0\)-group if and only if A and \(-A\) are both generators of \(C_0\)-semigroups, say \({\mathfrak {A}}_+^t\) and \({\mathfrak {A}}_-^t\), respectively, in which case we recover \({\mathfrak {A}}^t\) as

$$\begin{aligned} {\mathfrak {A}}^t x = {\left\{ \begin{array}{ll} {\mathfrak {A}}^t_+ x &{} \text {for } t \ge 0, \\ {\mathfrak {A}}_-^{-t} x &{}\text {for } t \le 0. \end{array}\right. } \end{aligned}$$

The well-known case of a unitary group \({\mathfrak {A}}^{-t} = ({\mathfrak {A}}^t)^* = ({\mathfrak {A}}^t)^{-1}\) is the special case where the generator A is skew-adjoint, \(A^* = -A\).

The above characterization of a \(C_0\)-group \({\mathfrak {A}}^t\) implies that the spectrum of the generator A is contained in a strip along the imaginary axis:

$$\begin{aligned} -\omega _{\mathfrak {A}}^-\leqslant {\text {Re}}\,(\lambda ) \leqslant \omega _{\mathfrak {A}}^+,\quad \text{ for } \text{ some }\quad \omega _{\mathfrak {A}}^-,\omega _{\mathfrak {A}}^+\in {{\mathbb {R}}}\end{aligned}$$
(5.1)

determined by the respective growth bounds of \({\mathfrak {A}}_+^t\) and \({\mathfrak {A}}_-^t\), see (2.5), and moreover

$$\begin{aligned} \Vert {\mathfrak {A}}^{t}_+ x \Vert \le M_+ e^{\omega ^+ t} \Vert x \Vert \quad \text{ and }\quad \Vert {\mathfrak {A}}_-^{t} x \Vert \le M_- e^{\omega ^- t} \Vert x \Vert ,\quad t\geqslant 0,\, x\in X, \end{aligned}$$
(5.2)

for all \(\omega ^\pm >\omega _{\mathfrak {A}}^\pm \) and corresponding \(M_\pm >0\). Using the group property, one can derive an upper and lower growth bound for the semigroup part:

Lemma 5.1

Let \({\mathfrak {A}}^t\) be a \(C_0\)-group with left and right growth bounds given by \(\omega _{\mathfrak {A}}^-, \omega _{\mathfrak {A}}^+\). Then for every \(\omega ^\pm >\omega _{\mathfrak {A}}^\pm \) there are constants \(\delta ,\rho >0\) such that

$$\begin{aligned} \delta \, e^{- \omega ^- t} \Vert x \Vert \leqslant \Vert {\mathfrak {A}}^t x \Vert \le \rho \, e^{\omega ^+ t} \Vert x \Vert ,\quad t\geqslant 0,\, x\in X. \end{aligned}$$
(5.3)

Proof

Let \(M_->0\) and \(M_+>0\) be as in (5.2). Set \(\rho =M_+\) and \(\delta =M_-^{-1}\). The right-hand bound follows immediately. For the left-hand bound, in the second inequality in (5.2) replace x by \({\mathfrak {A}}^t x\) and use that \({\mathfrak {A}}_-^t={\mathfrak {A}}^{-t} = ({\mathfrak {A}}^t)^{-1}\) to arrive at \( \Vert x \Vert \le M_- e^{\omega ^- t} \Vert {\mathfrak {A}}^t x \Vert \), or equivalently, \(\Vert {\mathfrak {A}}^t x \Vert \ge \delta \, e^{- \omega ^- t} \Vert x \Vert \). \(\square \)
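A numerical sanity check of (5.3) (our toy example, not part of the argument): for a normal generator one may take \(\delta =\rho =1\), with \(\omega ^-=-\min {\text {Re}}\,\sigma (A)\) and \(\omega ^+=\max {\text {Re}}\,\sigma (A)\).

```python
import numpy as np
from scipy.linalg import expm

# Toy check of (5.3): here sigma(A) = {-2, 1}, so omega^- = 2 and
# omega^+ = 1, and delta = rho = 1 works since A is normal (diagonal).
A = np.diag([-2.0, 1.0])          # generates the group t -> expm(t * A)
omega_minus, omega_plus = 2.0, 1.0

rng = np.random.default_rng(1)
for t in np.linspace(0.0, 5.0, 11):
    x = rng.normal(size=2)
    nAtx = np.linalg.norm(expm(t * A) @ x)
    assert np.exp(-omega_minus * t) * np.linalg.norm(x) <= nAtx + 1e-9
    assert nAtx <= np.exp(omega_plus * t) * np.linalg.norm(x) + 1e-9
```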

We say that a \(C_0\)-semigroup \({\mathfrak {A}}^t\) embeds into a \(C_0\)-group if there exists a \(C_0\)-group (usually also denoted by \({\mathfrak {A}}\)) which coincides with the original semigroup \({\mathfrak {A}}^t\) for \(t \in {{{\mathbb {R}}}^{+}}\). The following proposition characterizes when a \(C_0\)-semigroup can be embedded into a \(C_0\)-group.

Proposition 5.2

For a \(C_0\)-semigroup \({\mathfrak {A}}^t\) the following are equivalent:

  (1) \({\mathfrak {A}}^t\) embeds into a \(C_0\)-group;

  (2) \({\mathfrak {A}}^t\) is invertible (in \({{\mathcal {B}}}(X)\)) for all \(t \ge 0\);

  (3) \({\mathfrak {A}}^t\) is invertible for some \(t > 0\).

Proof

Clearly (2) implies (3). The proposition on page 80 of [13] states the implication (3) \(\Rightarrow \) (1), and the remaining implication (1) \(\Rightarrow \) (2) is easy: for \(t\geqslant 0\) we have \({\mathfrak {A}}^t {\mathfrak {A}}^{-t}={\mathfrak {A}}^0=1_X={\mathfrak {A}}^{-t} {\mathfrak {A}}^{t}\), so that \({\mathfrak {A}}^t\) is invertible. \(\square \)

If \({\mathfrak {A}}^t\) is a \(C_0\)-semigroup that embeds into a \(C_0\)-group, then it necessarily satisfies (5.3); the upper bound comes for free from the one-sided strong continuity. However, it is not necessarily the case that a \(C_0\)-semigroup \({\mathfrak {A}}^t\) satisfying (5.3) embeds into a \(C_0\)-group. Indeed, take \({\mathfrak {A}}^t = \tau _+^{-t}\) to be the right translation semigroup on \(L^2({{\mathbb {R}}}^+)\). Then \(\tau _+^{-t}\) (\(t \ge 0\)) is isometric and hence satisfies the lower estimate \(\Vert \tau _+^{-t} x \Vert \ge \delta e^{-\omega t} \Vert x \Vert \) with \(\delta = 1\) and \(\omega = 0\), but \(\tau _+^{-t}\) is not onto, and hence not invertible on \(L^2({{\mathbb {R}}}^+)\) for any \(t > 0\).

We next give some sufficient conditions which guarantee the \(L^2\)-controllability and/or \(L^2\)-observability of a given well-posed linear system \(\Sigma \). In fact, we will show that under the assumptions of the proposition below, the system is exactly controllable and/or exactly observable in any time \(t>0\); see Definition 9.4.1 in [31].

Proposition 5.3

Suppose that \(\Sigma \) is a minimal well-posed linear system with transfer function \({\widehat{{\mathfrak {D}}}}\) in \(H^\infty ({{\mathbb {C}}}^+; {{\mathcal {B}}}(U, Y))\) and with its \(C_0\)-semigroup \({\mathfrak {A}}^t\) invertible on X for some (and hence all) \(t>0\). Then:

  (1) Assume there exists a closed subspace \(U_0\) of U such that the control operator \(B\in {{\mathcal {B}}}(U,X_{-1})\) maps \(U_0\) onto X (viewed as an algebraic subspace of \(X_{-1}\)). Then \(\Sigma \) is \(L^2\)-controllable.

  (2) Assume there exists a closed subspace \(Y_0\) of Y such that, for the observation operator \(C\in {{\mathcal {B}}}(X_1,Y)\), the operator \(P_{Y_0}C\) extends to a bounded operator from X into \(Y_0\) which is bounded below. Then \(\Sigma \) is \(L^2\)-observable.

  (3) Assume that B and C satisfy the conditions of (1) and (2), respectively. Then \(\Sigma \) is \(L^2\)-minimal.

Proof

Note that statement (2) follows from (1) applied to \(\Sigma ^d\) and that statement (3) follows simply by combining statements (1) and (2). Thus it suffices to consider in detail only statement (1). We may moreover consider the restricted system where the input signals are restricted to values in \(U_0\), since \(L^2\)-controllability of the restricted system implies \(L^2\)-controllability of the original system as long as \({{\mathbf {W}}}_c^\bigstar \) is densely defined for the original system. Hence we will without loss of generality assume that B maps U onto X in the sequel.

Since \(\Sigma \) is observable, by Corollary 3.5 we see that \({{\mathbf {W}}}_c^\bigstar \) is indeed densely defined. Then we may apply Proposition 3.2 to get that \(L^{2 -}_{\ell ,U} \subset {\text {dom}}({{\mathbf {W}}}_c)\) and \({{\mathbf {W}}}_c |_{L^{2 - }_{\ell , U}} = {\mathfrak {B}}\); then

$$\begin{aligned} {\text {Rea}}(\Sigma ) = {\text {ran}}({\mathfrak {B}})\subset {\text {ran}}({{\mathbf {W}}}_c). \end{aligned}$$

To show the \(L^2\)-controllability condition \( {\text {ran}}({{\mathbf {W}}}_c) = X\), we will actually show that \(\Sigma \) is exactly controllable in any finite time \(\delta > 0\): for any \(x\in X\), we will construct an input signal \({{\mathbf {u}}}\in L^2([-\delta , 0], U)\) such that \({\mathfrak {B}}{{\mathbf {u}}}= x\). For this, use the surjectivity of \(B \in {{\mathcal {B}}}(U, X)\) to find a \(u \in U\) such that \(Bu = x\). We are done if we can find \({{\mathbf {u}}}\in L^2([-\delta , 0], U)\) such that \({\mathfrak {B}}{{\mathbf {u}}}= Bu\), i.e.,

$$\begin{aligned} \int _{-\delta }^0 {\mathfrak {A}}^{-s} B {{\mathbf {u}}}(s) \,{\mathrm {d}}s = B u. \end{aligned}$$

As B is surjective, B has a bounded right inverse \(B^\dagger \), and it is easily checked that the function

$$\begin{aligned} {{\mathbf {u}}}(s) = \frac{1}{\delta } B^\dagger {\mathfrak {A}}^s B u,\quad \text {for } -\delta \le s \le 0 \end{aligned}$$

does the job. \(\square \)
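The final step can be made concrete numerically. The following sketch (our finite-dimensional toy data, with a trivially surjective B) verifies that the proposed input indeed steers the state from 0 to Bu:

```python
import numpy as np
from scipy.linalg import expm

# Sketch of the steering input from the proof: with
#   u(s) = (1/delta) Bdag expm(s A) B u0,
# one gets
#   int_{-delta}^0 expm(-s A) B u(s) ds
#     = (1/delta) int_{-delta}^0 expm(-s A) (B Bdag) expm(s A) B u0 ds
#     = B u0,
# because B Bdag = I for a right inverse Bdag of the surjective B.
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])       # skew-adjoint, so expm(t A) is a group
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, -1.0]])  # surjective onto X = R^2
Bdag = np.linalg.pinv(B)          # bounded right inverse: B @ Bdag = I
delta, u0 = 0.7, np.array([2.0, 0.0, 1.0])

def u(s):
    return (1.0 / delta) * Bdag @ expm(s * A) @ B @ u0

N = 400                           # composite midpoint quadrature
ss = -delta + (np.arange(N) + 0.5) * (delta / N)
reached = sum(expm(-s * A) @ B @ u(s) for s in ss) * (delta / N)
assert np.allclose(reached, B @ u0, atol=1e-8)
```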

Remark 5.4

For the infinite dimensional setting, the conditions on B and C in Proposition 5.3 are rather strong. Indeed, if X is infinite dimensional, the surjectivity of B forces the input space U to be infinite dimensional as well, and similarly, injectivity of C forces \(\dim (Y)=\infty \). However these hypotheses are not so offensive in our application to the proof of the strict infinite dimensional BRL (Theorem 1.12, with proof to come in §8), as the idea is to embed the nominal system \(\Sigma \) (which may have finite dimensional input and/or output spaces) into an auxiliary system \(\Sigma _\varepsilon \) which does have infinite dimensional input and output spaces. The one remaining restrictive hypothesis in Proposition 5.3 (compared to the discrete-time setting of [10]) is that the semigroup can be embedded in a \(C_0\)-group. This appears to be unavoidable if one wants to achieve \(L^2\)-controllability (\(L^2\)-observability) with a bounded control (observation) operator. The following example supports this observation.

Example 5.5

Here we give an example of a strict Schur-class function \({\widehat{{\mathfrak {D}}}}\) from \(U:=\ell ^2({{{\mathbb {Z}}}^{+}})\) to \(Y:=U\). Later on, in Example 8.2 below, we shall complete the example by finding explicitly the maximal and minimal, bounded and boundedly invertible solutions of the KYP inequality, as guaranteed by Theorem 1.12.

Take \(X:=U\), with the canonical orthonormal basis \(\{\phi _n \mid n = 0, 1, 2, \dots \}\) where \(\phi _n \in \ell ^2\) has a one in position n and zeros elsewhere. Thus each vector \(x \in X = \ell ^2({{\mathbb {Z}}}_+)\) can be represented as \(x = \sum _{n=0}^\infty x_n \phi _n\) where \(x_n = \langle x, \phi _n \rangle _{\ell ^2({{\mathbb {Z}}}_+)}\) and \(\sum _{n=0}^\infty |x_n|^2 < \infty \). Define A by

$$\begin{aligned} A :\sum _{n=0}^\infty x_n \phi _n \mapsto \sum _{n=0}^\infty -(n+1)x_n \phi _n \end{aligned}$$

with \( {\text {dom}}(A) = \{ x \in X \mid A x \in X\}\), i.e.,

$$\begin{aligned} {\text {dom}}(A) = \left\{ x = \sum _{n=0}^\infty x_n \phi _n \in \ell ^2({{\mathbb {Z}}}_+) \biggm \vert \sum _{n=0}^\infty (n+1)^2 |x_n|^2 < \infty \right\} . \end{aligned}$$

In particular \(\phi _n\in {\text {dom}}(A)\) for all n. By [31,  §4.9], A generates an exponentially stable diagonal semigroup \({\mathfrak {A}}\) on X, which is determined by the condition

$$\begin{aligned} {\mathfrak {A}}^t\phi _n=e^{-(n+1)t}\phi _n,\qquad n=0,1,\ldots , \end{aligned}$$
(5.4)

since this function is the unique solution of the Cauchy problem \(\dot{x}=Ax\) with \(x(0)=\phi _n\):

$$\begin{aligned} {\frac{{\mathrm{d }}}{{\mathrm{d }}t}}e^{-(n+1)t}\phi _n= -(n+1)e^{-(n+1)t}\phi _n= Ae^{-(n+1)t}\phi _n,\qquad t\geqslant 0. \end{aligned}$$

Moreover, \(\Vert {\mathfrak {A}}^t\Vert =e^{-t}\), so that \({\mathfrak {A}}\) is a contraction semigroup with growth bound

$$\begin{aligned} \omega _{\mathfrak {A}}=\lim _{t\rightarrow \infty } \frac{\ln \Vert {\mathfrak {A}}^t\Vert }{t}=-1, \end{aligned}$$

which shows that \({{\mathbb {C}}}_{-1}\subset \rho (A)\).

Note that the Cayley transform \({\mathbf {A}}\) of the operator A is determined by

$$\begin{aligned} {\mathbf {A}}\phi _n=(1_X+A)(1_X-A)^{-1}\phi _n=-\frac{n}{2+n}\phi _n, \end{aligned}$$

and since \(-n/(2+n)\rightarrow -1\) as \(n\rightarrow \infty \), the spectral radius of \({\mathbf {A}}\) is 1. Hence the Cayley transform does not always map the generator of an exponentially stable semigroup to an operator which is exponentially stable in the discrete-time sense. Therefore, it is not possible to reduce the study of the strict bounded real lemma in continuous time to the discrete-time case in [9,  Theorem 1.6] by means of the Cayley transform, as was done for the non-strict case in [5]. Moreover, the semigroup \({\mathfrak {A}}\) cannot be embedded into a group, since (5.1) is violated.
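The eigenvalue computation above is elementary and easily checked numerically; the following sketch (ours) recomputes the Cayley-transformed eigenvalues and their moduli:

```python
import numpy as np

# The eigenvalues of A are -(n+1), and the Cayley transform sends them to
#   (1 + lambda)/(1 - lambda) = -n/(n + 2) -> -1   as n -> infinity,
# so the spectral radius of the transform tends to 1 even though the
# semigroup has negative growth bound.
n = np.arange(0, 50)
lam = -(n + 1.0)
cayley = (1.0 + lam) / (1.0 - lam)
assert np.allclose(cayley, -n / (n + 2.0))
print(np.abs(cayley).max())   # approaches 1 as more modes are included
```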

Now observe that

$$\begin{aligned} \int _0^\infty \Vert {\mathfrak {A}}^t\phi _n\Vert ^2\,{\mathrm {d}}t=\frac{1}{2n+2}, \end{aligned}$$

and hence the unbounded operator \(C:=2(-A)^{\frac{1}{2}}\) gives \({{\mathbf {W}}}_ox=t\mapsto C{\mathfrak {A}}^t x\) bounded both from above and below, as an operator from X into \(L^{2+}_Y\), but with norm \(\sqrt{2}\) it is not the output map of a passive system; see Lemma 3.6. However, C is an infinite time admissible observation operator for \({\mathfrak {A}}\) and the pair (C, A) is \(L^2\)-observable. If C is made essentially more unbounded, then it is no longer an admissible observation operator for \({\mathfrak {A}}\), and if C is made essentially more bounded, then we lose \(L^2\)-observability. By duality, \(B:=\frac{1}{2}(-A_{-1})^{\frac{1}{2}}\) is an admissible control operator for \({\mathfrak {A}}\) and (A, B) is an \(L^2\)-controllable pair; note that \(A_{-1}\) is described by the same formula as A, but the domain is extended to all of X.
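The integral computation behind the two-sided bounds on \({{\mathbf {W}}}_o\) can be verified numerically; the sketch below (ours) checks that \(\int _0^\infty \Vert C{\mathfrak {A}}^t\phi _n\Vert ^2\,{\mathrm {d}}t=2\) for every n, so that \(\Vert {{\mathbf {W}}}_o\phi _n\Vert =\sqrt{2}\) independently of n:

```python
import numpy as np
from scipy.integrate import quad

# With C = 2(-A)^{1/2} one has ||C A^t phi_n||^2 = 4(n+1) e^{-2(n+1)t},
# whose integral over [0, infinity) equals 2 for every n.
for n in range(40):
    val, _ = quad(lambda t: 4 * (n + 1) * np.exp(-2 * (n + 1) * t),
                  0.0, np.inf)
    assert abs(val - 2.0) < 1e-8
```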

We now have the operators A, B and C. To get a system node we still need to fix the special point \(\alpha \in {{\mathbb {C}}}_{\omega _{\mathfrak {A}}}\) and the corresponding value of the transfer function \({\widehat{{\mathfrak {D}}}}(\alpha )\); for convenience we take \(\alpha =0\). The domain of the system node is

$$ \begin{aligned} {\text {dom}}({A \& B})=\left\{ \begin{bmatrix}x\\ u\end{bmatrix}\in \begin{bmatrix} X \\ U \end{bmatrix} \bigm \vert A_{-1}x+Bu \in X \right\} ,\quad {A \& B}=\begin{bmatrix}A_{-1}&B\end{bmatrix}\big |_{{\text {dom}}({A \& B})}, \end{aligned}$$

and the combined feedthrough/observation operator becomes

$$ \begin{aligned} {C \& D}\begin{bmatrix}x\\ u\end{bmatrix}=C\left( x+A_{-1}^{-1}Bu\right) +{\widehat{{\mathfrak {D}}}}(0)u, \quad \begin{bmatrix}x\\ u\end{bmatrix}\in {\text {dom}}({A \& B}). \end{aligned}$$
(5.5)

Specializing (5.5) to \(x=x_n\phi _n\) and \(u=u_m\phi _m\) gives

$$ \begin{aligned} {C \& D}\begin{bmatrix}x_n\phi _n\\ u_m\phi _m\end{bmatrix}=2\sqrt{n+1}\,x_n\phi _n +({\widehat{{\mathfrak {D}}}}(0)-1_U)\,u_m\phi _m,\quad x_n,u_m\in {{\mathbb {C}}}. \end{aligned}$$
(5.6)

On the other hand, specializing (5.5) to \(x=(\lambda -A_{-1})^{-1}Bu\), we get from (4.4) that the transfer function is

$$\begin{aligned} \begin{aligned} {\widehat{{\mathfrak {D}}}}(\lambda )u&= C\left( (\lambda -A_{-1})^{-1}Bu+A_{-1}^{-1}Bu\right) +{\widehat{{\mathfrak {D}}}}(0)u\\&=(-A)^{\frac{1}{2}}\lambda \,(\lambda -A)^{-1}A_{-1}^{-1}(-A_{-1})^{\frac{1}{2}}u+{\widehat{{\mathfrak {D}}}}(0)u\\&= -\lambda \,(\lambda -A)^{-1}u+{\widehat{{\mathfrak {D}}}}(0)u, \quad \lambda \in {{\mathbb {C}}}_{-1}, \end{aligned} \end{aligned}$$

where in the last step we used that \((-A)^{\frac{1}{2}}\) commutes with \((\lambda -A)^{-1}\) and \((-A_{-1})^{\frac{1}{2}}\) commutes with \(A_{-1}^{-1}\); it is easy to check directly that the m-accretive operator \(-A\) commutes with the bounded operator \((\lambda -A)^{-1}\); see [21,  Theorem 3.35 on p. 281].

Taking for instance \({\widehat{{\mathfrak {D}}}}(0):=0\), we get from [31,  Corollary 3.4.5] that \({\widehat{{\mathfrak {D}}}}\) is a Schur function, but letting \(\lambda \rightarrow \infty \) along the positive real line, we get from [31,  Theorem 3.2.9(iii)] that \({\widehat{{\mathfrak {D}}}}(\lambda )u\rightarrow -u\) for all \(u\in U\), and so \({\widehat{{\mathfrak {D}}}}\) is not a strict Schur function.

However, if we instead set \({\widehat{{\mathfrak {D}}}}(0):=\frac{1}{2} 1_U\), then we get

$$\begin{aligned} {\widehat{{\mathfrak {D}}}}(\lambda )=- \lambda \,(\lambda -A)^{-1}+\frac{1}{2}=- \frac{1}{2}(\lambda +A)(\lambda -A)^{-1},\quad \lambda \in {{\mathbb {C}}}_{-1}, \end{aligned}$$
(5.7)

which satisfies \( \Vert {\widehat{{\mathfrak {D}}}}(\lambda )\Vert \leqslant \frac{1}{2}\) for \(\lambda \in {{{\mathbb {C}}}^{+}}\), i.e. this is a strict Schur function. In Example 8.2 below, we continue this example, in order to get two extremal solutions to the bounded KYP inequality (1.14) which are bounded both above and below.
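Since \({\widehat{{\mathfrak {D}}}}(\lambda )\) in (5.7) is diagonal with entries \(-\frac{1}{2}(\lambda -(n+1))/(\lambda +(n+1))\), the claimed bound follows from \(|\lambda -(n+1)|\leqslant |\lambda +(n+1)|\) for \({\text {Re}}\,\lambda >0\); a quick numerical confirmation (ours, on a finite grid and finitely many modes):

```python
import numpy as np

# Check of the bound for (5.7): each diagonal entry of D-hat(lambda) is
#   -0.5 (lambda - (n+1)) / (lambda + (n+1)),
# and for Re(lambda) > 0 one has |lambda - (n+1)| <= |lambda + (n+1)|,
# so every entry, and hence the operator norm, is at most 1/2.
n = np.arange(0, 200)
for re in [0.01, 0.1, 1.0, 10.0]:
    for im in np.linspace(-100.0, 100.0, 41):
        lam = re + 1j * im
        entries = -0.5 * (lam - (n + 1)) / (lam + (n + 1))
        assert np.abs(entries).max() <= 0.5 + 1e-12
```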

Finally, we observe that, in both of the above cases, \({\widehat{{\mathfrak {D}}}}\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(X))\), and then [31,  Theorem 10.3.6(iv)] gives that the system node \( \left[ {\begin{matrix}{A \& B}\\ {C \& D} \end{matrix}}\right] \) is well-posed, but it is not passive, as we already saw. We may, however, apply Theorem 1.10 to get that \( \left[ {\begin{matrix}{A \& B}\\ {C \& D} \end{matrix}}\right] \) is similar to a passive system.

As the preceding example shows, \(L^2\)-minimality may be an exotic property. We further add to this conclusion by observing that, in general, unless the point spectrum of A is confined to a vertical strip, the pair (A, B) is not \(L^2\)-controllable for any bounded operator \(B:U\rightarrow X\), and dually, no bounded \(C:X\rightarrow Y\) makes (C, A) an \(L^2\)-observable pair. Indeed, \({{\mathbb {C}}}^+_{\omega _{\mathfrak {A}}}\subset \rho (A)\), and so if \(\sigma _p(A)\) is not contained in a vertical strip, then there exist eigenpairs \((\lambda _n,\phi _n)\) of A such that \(\Vert \phi _n\Vert =1\) and \({\text {Re}}\,\lambda _n\rightarrow -\infty \) as \(n\rightarrow \infty \). Since \(\phi _n\in {\text {dom}}(A)\), we have for bounded C and \({\text {Re}}\,\lambda _n<0\) that

$$\begin{aligned} \Vert {\mathfrak {C}}\phi _n\Vert ^2_{L^{2+}_Y}= \int _0^\infty \Vert C{\mathfrak {A}}^t\phi _n\Vert ^2\,{\mathrm {d}}t\leqslant \Vert C\phi _n\Vert ^2\int _0^\infty e^{2{\text {Re}}\,\lambda _n t}\,{\mathrm {d}}t \leqslant \frac{\Vert C\Vert ^2}{-2{\text {Re}}\,\lambda _n}\,; \end{aligned}$$

here we used the extension of (5.4) to an arbitrary eigenpair. Thus \(\phi _n\in {\text {dom}}({{\mathbf {W}}}_o)\), and by letting \(n\rightarrow \infty \), we get from \({{\mathbf {W}}}_o\phi _n={\mathfrak {C}}\phi _n\) that \(\Vert {{\mathbf {W}}}_o\phi _n\Vert \rightarrow 0\) while \(\Vert \phi _n\Vert =1\). This proves that (C, A) is not \(L^2\)-observable. The statement on controllability can be obtained by duality. Compare this to (5.1) and Remark 5.4.

We end this section by pointing out that observability can be strengthened to \(L^2\)-observability by enlarging the state space and weakening its norm, while controllability can be strengthened to \(L^2\)-controllability by shrinking the state space and strengthening its norm, so as to make the \(L^2\)-reachable state space a Hilbert space; see [31,  Theorem 9.4.7 and Proposition 9.4.9]. Note in particular the close relation between \(L^2\)-controllability/observability and the concepts “exact controllability/observability in infinite time (with bound \(\omega =0\))” used by Staffans; see [31,  Definitions 9.4.1–2]. A difference in the approach is that we here force \(\omega =0\) and accept that \({{\mathbf {W}}}_c\) and/or \({{\mathbf {W}}}_o\) may be unbounded, whereas in [31], Staffans is flexible about \(\omega \) in order to get \({\widetilde{{\mathfrak {B}}}}\) and \({\widetilde{{\mathfrak {C}}}}\) in (2.7) bounded.

6 The Available Storage and the Required Supply

In this section we return to the notion of storage functions associated with a well-posed system as in Definition 1.1, which we recall here for the reader's convenience: A function \(S:X\rightarrow [0,\infty ]\) is called a storage function for the well-posed system \(\Sigma \) in (2.1) if \(S(0)=0\) and for all trajectories \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma \) on \({{\mathbb {R}}}^+\) it holds that

$$\begin{aligned} S\left( {{\mathbf {x}}}(t)\right) + \Vert \pi _{[0,t]}{{\mathbf {y}}}\Vert _{L^{2+}_Y}^2 \leqslant S\left( {{\mathbf {x}}}(0)\right) + \Vert \pi _{[0,t]}{{\mathbf {u}}}\Vert _{L^{2+}_U}^2, \quad t>0. \end{aligned}$$
(6.1)

For systems \(\Sigma \) that have densely defined \({\mathbf {W}}_c^\bigstar \), \(L^2\)-regular storage functions are defined as those storage functions that are finite-valued on \({\text {ran}}({\mathbf {W}}_c)\). A storage function S is called quadratic if there exists a positive semidefinite operator H on X, such that

$$\begin{aligned} S(x)=S_H(x):=\left\{ \begin{aligned} \Vert H^{\frac{1}{2}}x \Vert ^2,&\quad x\in {\text {dom}}(H^{\frac{1}{2}}), \\ \infty ,&\quad x\not \in {\text {dom}}(H^{\frac{1}{2}}). \end{aligned}\right. \end{aligned}$$
(6.2)

Quadratic storage functions are of particular interest since they provide spatial solutions to the KYP inequality.

Proposition 6.1

If the well-posed system \(\Sigma \) has a storage function S, then the transfer function \({\widehat{{\mathfrak {D}}}}\) of \(\Sigma \) has a unique analytic continuation to a Schur function on \({{{\mathbb {C}}}^{+}}\).

Proof

From (6.1) it is immediate that every trajectory of \(\Sigma \) on \({{{\mathbb {R}}}^{+}}\) with \({{\mathbf {u}}}\in L_U^{2+}\) and \({{\mathbf {x}}}(0)=0\) satisfies

$$\begin{aligned} 0\leqslant S\left( {{\mathbf {x}}}(t)\right) \leqslant \Vert \pi _{[0,t]}{{\mathbf {u}}}\Vert _{L^{2+}_U}^2-\Vert \pi _{[0,t]}{{\mathbf {y}}}\Vert _{L^{2+}_Y}^2,\quad t>0. \end{aligned}$$
(6.3)

Letting \(t\rightarrow \infty \) in (6.3), we see that \({{\mathbf {y}}}\in L_Y^{2+}\), and we get from (2.3) that

$$\begin{aligned} \Vert {\mathfrak {D}}{{\mathbf {u}}}\Vert _{L^{2+}_Y}^2=\Vert {{\mathbf {y}}}\Vert _{L^{2+}_Y}^2 \leqslant \Vert {{\mathbf {u}}}\Vert _{L^{2+}_U}^2. \end{aligned}$$

From \(\pi _-{\mathfrak {D}}\pi _+=0\) and \(\tau ^s{\mathfrak {D}}={\mathfrak {D}}\tau ^s\) for all \(s\in {{\mathbb {R}}}\), we get

$$\begin{aligned} \Vert {\mathfrak {D}}\tau ^s{{\mathbf {u}}}\Vert ^2_{L^2_Y}=\Vert \tau ^s{\mathfrak {D}}{{\mathbf {u}}}\Vert ^2_{L^2_Y} =\Vert {\mathfrak {D}}{{\mathbf {u}}}\Vert ^2_{L^2_Y}=\Vert {\mathfrak {D}}{{\mathbf {u}}}\Vert ^2_{L^{2+}_Y} \leqslant \Vert {{\mathbf {u}}}\Vert ^2_{L^{2+}_U} =\Vert \tau ^s {{\mathbf {u}}}\Vert ^2_{L^2_U}. \end{aligned}$$

By letting s run over \({{\mathbb {R}}}\), we obtain that \({\mathfrak {D}}\) restricted to \(L^2_{\ell ,U}\) has a unique extension to a time-invariant, causal operator L from \(L^2_U\) into \(L^2_Y\) with norm at most 1. This implies that \({{\mathcal {L}}}L{{\mathcal {L}}}^*:L^2(i{{\mathbb {R}}};U)\rightarrow L^2(i{{\mathbb {R}}};Y)\) coincides with a multiplication operator \(M_F\) with symbol \(F\in H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(U,Y))\) satisfying \(\Vert F\Vert _\infty = \Vert {{\mathcal {L}}}L{{\mathcal {L}}}^*\Vert =\Vert L \Vert \leqslant 1\). Hence, \(F\in {{\mathcal {S}}}_{U,Y}\). Moreover, F is an extension of \({\widehat{{\mathfrak {D}}}}\), because for \(u\in L^{2+}_U\), by [31,  Corollary 4.6.10(iii)] (see the last part of Proposition 2.3) we have

$$\begin{aligned} F(\lambda )({{\mathcal {L}}}{{\mathbf {u}}})(\lambda ) = ({{\mathcal {L}}}L {{\mathbf {u}}})(\lambda ) = ({{\mathcal {L}}}{\mathfrak {D}}{{\mathbf {u}}})(\lambda ) = {\widehat{{\mathfrak {D}}}}(\lambda )({{\mathcal {L}}}{{\mathbf {u}}})(\lambda ), \quad \lambda \in {{\mathbb {C}}}_{\omega _0}, \end{aligned}$$

where \(\omega _0:=\max \left\{ \omega _{\mathfrak {A}},0 \right\} \). From \({{\mathcal {L}}}L^{2+}_U=H^{2+}_U\), we now get \({\widehat{{\mathfrak {D}}}}\big |_{{{\mathbb {C}}}_{\omega _0}}=F\big |_{{{\mathbb {C}}}_{\omega _0}}\). The continuation F of \({\widehat{{\mathfrak {D}}}}\) to the open connected set \({{{\mathbb {C}}}^{+}}\) is unique since \({{\mathbb {C}}}_{\omega _0}\) has an interior cluster point. \(\square \)

Proposition 6.2

Assume that \(S=S_H\) is of the form (6.2) with H on X positive semidefinite. Then \(S_H\) is a storage function for \(\Sigma \) if and only if H is a spatial solution to the KYP inequality (1.12)–(1.13).

Proof

Let S be quadratic, i.e., \(S=S_H\) as in (6.2) for some positive semidefinite operator H on X. First assume that S is a storage function for \(\Sigma \), so that (6.1) holds for all trajectories \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) on \({{{\mathbb {R}}}^{+}}\) of \(\Sigma \). Pick \(t>0\), \(x_0\in {\text {dom}}(H^{\frac{1}{2}})\) and \({{\mathbf {u}}}\in L^{2}([0,t];U)\) arbitrarily. By (2.1) and (2.3),

$$\begin{aligned} {{\mathbf {x}}}(t):={\mathfrak {A}}^t x_0+{\mathfrak {B}}^t {{\mathbf {u}}}\quad \text{ and }\quad \pi _{[0,t]}{{\mathbf {y}}}:= {\mathfrak {C}}^t x_0+{\mathfrak {D}}^t {{\mathbf {u}}},\quad t>0, \end{aligned}$$

define a trajectory \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) on [0, t] of \(\Sigma \) with \({{\mathbf {x}}}(0)=x_0\). Now (6.1) and \(S(x_0)<\infty \) imply that \(S({{\mathbf {x}}}(t))<\infty \), and hence that \({\mathfrak {A}}^t x_0+{\mathfrak {B}}^t {{\mathbf {u}}}={{\mathbf {x}}}(t)\in {\text {dom}}(H^{\frac{1}{2}})\). Taking first \({{\mathbf {u}}}=0\) and then \(x_0=0\), we get (1.12).

Since \(S=S_H\), we obtain that

$$\begin{aligned} \begin{aligned} S\left( {{\mathbf {x}}}(t)\right) +\Vert \pi _{[0,t]}{{\mathbf {y}}}\Vert ^2 =\left\| \begin{bmatrix}H^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix}\begin{bmatrix}{{\mathbf {x}}}(t)\\ \pi _{[0,t]}{{\mathbf {y}}}\end{bmatrix}\right\| ^2&=\left\| \begin{bmatrix}H^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix}\begin{bmatrix}x_0\\ {{\mathbf {u}}}\end{bmatrix}\right\| ^2\\ \text {and}\qquad S\left( x_0\right) +\Vert {{\mathbf {u}}}\Vert ^2&=\left\| \begin{bmatrix}H^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix}\begin{bmatrix}x_0\\ {{\mathbf {u}}}\end{bmatrix}\right\| ^2. \end{aligned} \end{aligned}$$

Hence (6.1) is equivalent to

$$\begin{aligned} \left\| \begin{bmatrix}H^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix}\begin{bmatrix}x_0\\ {{\mathbf {u}}}\end{bmatrix}\right\| ^2 \leqslant \left\| \begin{bmatrix}H^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix}\begin{bmatrix}x_0\\ {{\mathbf {u}}}\end{bmatrix}\right\| ^2. \end{aligned}$$
(6.4)

Since \(x_0 \in {\text {dom}}(H^{\frac{1}{2}})\), \(t>0\) and \({{\mathbf {u}}}\in L^2([0,t];U)\) were chosen arbitrarily, we obtain (1.13). Conversely, it is clear that (1.13) implies (6.4) and hence that (6.1) holds. \(\square \)

Next we explain how solutions to the spatial KYP-inequality for a well-posed system relate to the solutions to the spatial KYP-inequality of the dual system.

Proposition 6.3

Let \(\Sigma \) be a well-posed system with causal dual \(\Sigma ^d\). A positive definite operator H on X is a spatial solution to the KYP-inequality for \(\Sigma \) if and only if \(H^{-1}\) is a spatial solution to the KYP-inequality for \(\Sigma ^d\): For all \(t>0\) it holds that

$$\begin{aligned} {\mathfrak {A}}^{t*}{\text {dom}}(H^{-\frac{1}{2}})\subset {\text {dom}}(H^{-\frac{1}{2}})\,,\quad {\mathfrak {C}}^{t*}L^2([0,t];Y)\subset {\text {dom}}(H^{-\frac{1}{2}}) \end{aligned}$$

and

$$\begin{aligned} \left\| \begin{bmatrix} H^{-\frac{1}{2}} &{} 0 \\ 0 &{} 1 \end{bmatrix} \begin{bmatrix} {\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t \end{bmatrix}^* \begin{bmatrix} x \\ {{\mathbf {y}}}\end{bmatrix} \right\| \le \left\| \begin{bmatrix} H^{-\frac{1}{2}} &{} 0 \\ 0 &{} 1 \end{bmatrix} \begin{bmatrix} x \\ {{\mathbf {y}}}\end{bmatrix} \right\| , \quad \begin{bmatrix} x \\ {{\mathbf {y}}}\end{bmatrix} \in \begin{bmatrix} {\text {dom}}(H^{-\frac{1}{2}})\\ L^2([0,t], Y) \end{bmatrix}.\qquad \end{aligned}$$
(6.5)

The proof could be carried out by mechanically imitating the proof of [10,  Proposition 5.3], replacing \(\left[ {\begin{matrix}{\mathbf {A}}&{}{\mathbf {B}}\\ {\mathbf {C}}&{}{\mathbf {D}} \end{matrix}}\right] \) by \(\left[ {\begin{matrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t \end{matrix}}\right] \). However, as Proposition 6.3 is not a core result of our theory, we illustrate how some continuous-time results can be imported from the discrete-time case by discretization using lifting of the input and output signals, combined with sampling of the state, as described in [31,  §2.4].

Proof of Proposition 6.3

That (6.5) is a correct statement of the spatial KYP inequality for \(\Sigma ^d\), with solution denoted by \(H^{-\frac{1}{2}}\) instead of by \(H^{\frac{1}{2}}\), follows from Lemma 2.5, the unitarity of \(\left[ {\begin{matrix}1_X&{}0\\ 0&{}\Lambda ^t_K \end{matrix}}\right] \) and the fact that \(\left[ {\begin{matrix}H^{\frac{1}{2}}&{}0\\ 0&{}1 \end{matrix}}\right] \) commutes with \(\left[ {\begin{matrix}1&{}0\\ 0&{}(\Lambda ^t_K)^* \end{matrix}}\right] \).

Now let H be a solution to the spatial KYP inequality in the sense of Theorem 1.9 and fix \(t>0\) arbitrarily. Then H is also a solution to the spatial KYP inequality for the discrete time system \(\left[ {\begin{matrix}{\mathbf {A}}&{}{\mathbf {B}}\\ {\mathbf {C}}&{}{\mathbf {D}} \end{matrix}}\right] :=\left[ {\begin{matrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t \end{matrix}}\right] \) with input space \(L^2([0,t];U)\), state space X and output space \(L^2([0,t];Y)\), in the sense of [10,  Theorem 1.3]. Then [10,  Proposition 5.3] gives that \(H^{-1}\) is a solution to the spatial KYP inequality for the discrete-time system \(\left[ {\begin{matrix}{\mathbf {A}}&{}{\mathbf {B}}\\ {\mathbf {C}}&{}{\mathbf {D}} \end{matrix}}\right] ^*:=\left[ {\begin{matrix}{\mathfrak {A}}^{t*}&{}{\mathfrak {C}}^{t*}\\ {\mathfrak {B}}^{t*}&{}{\mathfrak {D}}^{t*} \end{matrix}}\right] \), so that (6.5) holds. Since \(t>0\) was arbitrary, we obtain the result. \(\square \)

In Proposition 6.1 we proved that the existence of a storage function implies that the transfer function coincides with a Schur function on some right half-plane. In order to prove the converse implication, we now introduce the available storage

$$\begin{aligned} S_a(x_0):=\sup _{{{\mathbf {v}}}\in L^{2+}_{loc,U},\, t>0} \left( \Vert \pi _{[0,t]}{{\mathbf {y}}}\Vert ^2_{L^{2+}_Y} - \Vert \pi _{[0,t]}{{\mathbf {v}}}\Vert ^2_{L^{2+}_U} \right) ,\quad x_0\in X, \end{aligned}$$
(6.6)

where in the supremum, \({{\mathbf {y}}}\) is the output signal of the trajectory on \({{{\mathbb {R}}}^{+}}\) of \(\Sigma \), with input \({{\mathbf {v}}}\) and initial state \(x_0\), as well as the required supply

$$\begin{aligned} S_r(x_0):=\inf _{({{\mathbf {v}}},{{\mathbf {y}}},t)\in {\mathfrak {V}}_{x_0}} \left( \Vert \pi _{[t,0]} {{\mathbf {v}}}\Vert ^2_{L^{2-}_U} - \Vert \pi _{[t,0]} {{\mathbf {y}}}\Vert ^2_{L^{2-}_Y}\right) ,\quad x_0\in X, \end{aligned}$$
(6.7)

where

$$\begin{aligned} {\mathfrak {V}}_{x_0}:=\left\{ ({{\mathbf {v}}},{{\mathbf {y}}},t)\in L^2_{\ell ,loc,U\times Y}\times {{{\mathbb {R}}}^{-}}\biggm \vert \begin{array}{l} ({{\mathbf {v}}},{{\mathbf {x}}},{{\mathbf {y}}})\text { is a trajectory of }\Sigma \text { on }{{\mathbb {R}}},\\ {{\mathbf {x}}}(0)=x_0,~{\text {supp}}(\pi _-{{\mathbf {v}}})\subset [t,0] \end{array}\right\} . \end{aligned}$$

We need the following lemma in order to prove that \(S_a\) and \(S_r\) are storage functions if \({\widehat{{\mathfrak {D}}}}\in {{\mathcal {S}}}_{U,Y}\).

Lemma 6.4

Let \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) be a trajectory on \({{{\mathbb {R}}}^{+}}\) with \({{\mathbf {x}}}(0)=0\), of a system \(\Sigma \) whose transfer function is in \({{\mathcal {S}}}_{U,Y}\). Then

$$\begin{aligned} \Vert \pi _{[0,t]}{{\mathbf {y}}}\Vert ^2_{L^{2+}_Y} \leqslant \Vert \pi _{[0,t]}{{\mathbf {u}}}\Vert ^2_{L^{2+}_U},\quad t>0. \end{aligned}$$

Proof

By Theorem 3.4, the operator \(L_\Sigma \) in (3.8) is a contraction from \(L^2_U\) into \(L^2_Y\), such that \(L_\Sigma {{\mathbf {u}}}={\mathfrak {D}}{{\mathbf {u}}}\) for all \({{\mathbf {u}}}\in L^{2+}_{U}\). By (2.3), \({{\mathbf {y}}}={\mathfrak {D}}{{\mathbf {u}}}\), so that item (4) of Definition 2.1 gives

$$\begin{aligned} \begin{aligned} \pi _{[0,t]}{{\mathbf {y}}}&=\pi _{[0,t]}{\mathfrak {D}}\pi _{[0,t]}{{\mathbf {u}}}+\pi _{[0,t]}{\mathfrak {D}}\pi _{(t,\infty )}{{\mathbf {u}}}=\pi _{[0,t]}L_\Sigma \pi _{[0,t]}{{\mathbf {u}}}+\pi _{[0,t]}\tau ^{-t}{\mathfrak {D}}\tau ^t \pi _{(t,\infty )}{{\mathbf {u}}}\\&=\pi _{[0,t]}L_\Sigma \pi _{[0,t]}{{\mathbf {u}}}+\tau ^{-t}\pi _{[-t,0]}{\mathfrak {D}}\pi _+\tau ^t {{\mathbf {u}}}=\pi _{[0,t]}L_\Sigma \pi _{[0,t]}{{\mathbf {u}}}, \end{aligned} \end{aligned}$$

and then \(\Vert \pi _{[0,t]}{{\mathbf {y}}}\Vert =\Vert \pi _{[0,t]} L_\Sigma \pi _{[0,t]}{{\mathbf {u}}}\Vert \leqslant \Vert \pi _{[0,t]}{{\mathbf {u}}}\Vert \). \(\square \)

In the next result, we do not assume minimality, in contrast to many similar results in the literature.

Theorem 6.5

Assume that the well-posed system \(\Sigma \) has transfer function in \({{\mathcal {S}}}_{U,Y}\). Then \(S_a\) and \(S_r\) are storage functions for \(\Sigma \), which are extremal in the sense that every other storage function S for \(\Sigma \) satisfies

$$\begin{aligned} S_a(x_0)\leqslant S(x_0)\leqslant S_r(x_0),\quad x_0\in X. \end{aligned}$$
(6.8)

Proof

Step 1: \(S_a\) is a storage function for \(\Sigma \). Choose \({{\mathbf {v}}}=0\) in (6.6) to obtain that \(S_a(x_0)\geqslant 0\) for all \(x_0\in X\). On the other hand, by Lemma 6.4, \(\Vert \pi _{[0,t]}{{\mathbf {y}}}\Vert ^2-\Vert \pi _{[0,t]}{{\mathbf {v}}}\Vert ^2\leqslant 0\) for all trajectories \(({{\mathbf {v}}},{{\mathbf {x}}},{{\mathbf {y}}})\) on \({{\mathbb {R}}}^+\) with \({{\mathbf {v}}}\in L^{2+}_{loc,U}\) and \({{\mathbf {x}}}(0)=0\), and all \(t>0\). Thus \(S_a(0)=0\).

Let \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) be a system trajectory of \(\Sigma \) over \({{\mathbb {R}}}^+\) and fix \(t>0\). Let \({{\mathbf {v}}}\in L^{2+}_{loc,U}\) and write \({{\mathbf {x}}}_{{\mathbf {v}}}\) and \({{\mathbf {y}}}_{{\mathbf {v}}}\) for the state and output trajectory on \({{{\mathbb {R}}}^{+}}\) of \(\Sigma \) corresponding to the input \({{\mathbf {v}}}\) and initial state \({{\mathbf {x}}}_{{\mathbf {v}}}(0)={{\mathbf {x}}}(t)\). Define

$$\begin{aligned} (\widetilde{{{\mathbf {v}}}},\widetilde{{{\mathbf {x}}}},\widetilde{{{\mathbf {y}}}}):= \pi _{[0,t)}({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}}) +\tau ^{-t} ({{\mathbf {v}}},{{\mathbf {x}}}_{{\mathbf {v}}},{{\mathbf {y}}}_{{\mathbf {v}}}). \end{aligned}$$

Since \({{\mathbf {x}}}_{{\mathbf {v}}}(0)={{\mathbf {x}}}(t)\), trajectory property (4) listed after Definition 2.2 gives that \((\widetilde{{{\mathbf {v}}}},\widetilde{{{\mathbf {x}}}},\widetilde{{{\mathbf {y}}}})\) is also a trajectory of \(\Sigma \) over \({{\mathbb {R}}}^+\) with \(\widetilde{{{\mathbf {x}}}}(0)={{\mathbf {x}}}(0)\). For every \(s>0\), using (6.6), we now have

$$\begin{aligned}&\Vert \pi _{[0,s]}{{\mathbf {y}}}_{{\mathbf {v}}}\Vert ^2_{L^{2+}_Y} - \Vert \pi _{[0,s]} {{\mathbf {v}}}\Vert ^2_{L^{2+}_U} \\&\quad = \Vert \pi _{[t,t+s]} \tau ^{-t} {{\mathbf {y}}}_{{\mathbf {v}}}\Vert ^2_{L^{2+}_Y} - \Vert \pi _{[t,t+s]} \tau ^{-t} {{\mathbf {v}}}\Vert ^2_{L^{2+}_U}\\&\quad = \Vert \pi _{[0,t+s]} \widetilde{{{\mathbf {y}}}}\Vert ^2_{L^{2+}_Y} - \Vert \pi _{[0,t+s]} \widetilde{{{\mathbf {v}}}} \Vert ^2_{L^{2+}_U} - \Vert \pi _{[0,t]} {{\mathbf {y}}}\Vert ^2_{L^{2+}_Y} + \Vert \pi _{[0,t]} {{\mathbf {u}}}\Vert ^2_{L^{2+}_U}\\&\quad \leqslant S_a({{\mathbf {x}}}(0)) + \int _0^t\Vert {{\mathbf {u}}}(\tau )\Vert _U^2\,{\mathrm {d}}\tau -\int _0^t\Vert {{\mathbf {y}}}(\tau )\Vert _Y^2\,{\mathrm {d}}\tau . \end{aligned}$$

Taking the supremum over \({{\mathbf {v}}}\in L^{2+}_{loc,U}\) and \(s>0\), it follows that \(S_a\) satisfies (6.1).

Step 2: \(S_r\) is a storage function for \(\Sigma \). For \(x_0\not \in {\text {ran}}({\mathfrak {B}})\) it follows from (2.4) that \({\mathfrak {V}}_{x_0}=\emptyset \), so that \(S_r(x_0)=\inf \emptyset =\infty \geqslant 0\). Now assume that \(x_0\in {\text {ran}}({\mathfrak {B}})\) and choose \({{\mathbf {v}}}\in L^{2}_{\ell , loc,U}\) with \({\mathfrak {B}}\pi _- {{\mathbf {v}}}=x_0\) and \(t<0\) with \({\text {supp}}(\pi _-{{\mathbf {v}}})\subset [t,0]\) arbitrarily. Let \(({{\mathbf {v}}},{{\mathbf {x}}}_{{\mathbf {v}}},{{\mathbf {y}}}_{{\mathbf {v}}})\) be the associated trajectory of \(\Sigma \) on \({{\mathbb {R}}}\), so that \({{\mathbf {x}}}_{{\mathbf {v}}}(t)={\mathfrak {B}}\pi _- \tau ^t {{\mathbf {v}}}=0\) and \({{\mathbf {x}}}_{{\mathbf {v}}}(0)={\mathfrak {B}}\pi _- {{\mathbf {v}}}=x_0\). By trajectory property (2), \(\tau ^t({{\mathbf {v}}}, {{\mathbf {x}}}_{{\mathbf {v}}},{{\mathbf {y}}}_{{\mathbf {v}}})\) is a trajectory of \(\Sigma \) on \({{{\mathbb {R}}}^{+}}\) with \((\tau ^t {{\mathbf {x}}}_{{\mathbf {v}}})(0)={{\mathbf {x}}}_{{\mathbf {v}}}(t)=0\). Then Lemma 6.4 gives that

$$\begin{aligned} \Vert \pi _{[t,0]} {{\mathbf {y}}}_{{\mathbf {v}}}\Vert _{L^{2-}_Y} =\Vert \pi _{[0,-t]} \tau ^t {{\mathbf {y}}}_{{\mathbf {v}}}\Vert _{L^{2+}_Y} \leqslant \Vert \pi _{[0,-t]}\tau ^t {{\mathbf {v}}}\Vert _{L^{2+}_U} =\Vert \pi _{[t,0]}{{\mathbf {v}}}\Vert _{L^{2-}_U}, \end{aligned}$$

that is,

$$\begin{aligned} \Vert \pi _{[t,0]}{{\mathbf {v}}}\Vert _{L^{2-}_U}-\Vert \pi _{[t,0]} {{\mathbf {y}}}_{{\mathbf {v}}}\Vert _{L^{2-}_Y}\geqslant 0. \end{aligned}$$

Taking the infimum over all pairs \(({{\mathbf {v}}},t)\in L^2_{\ell ,loc,U}\times {{{\mathbb {R}}}^{-}}\) with \({\mathfrak {B}}\pi _-{{\mathbf {v}}}=x_0\) and \({\text {supp}}(\pi _-{{\mathbf {v}}})\subset [t,0]\), we conclude that \(S_r(x_0)\geqslant 0\). For \(x_0=0\), we may make the particular choice \({{\mathbf {v}}}=0\) in (6.7), in order to get \(S_r(0)\leqslant 0-0=0\).

To see that \(S_r\) satisfies (6.1), we give a similar argument as in Step 1. Let \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) be a system trajectory of \(\Sigma \) over \({{\mathbb {R}}}^+\) and fix \(t>0\). If \({{\mathbf {x}}}(0)\not \in {\text {ran}}({\mathfrak {B}})\), then \(S_r({{\mathbf {x}}}(0))=\inf \emptyset =\infty \), and hence (6.1) is satisfied. Now assume that \({{\mathbf {x}}}(0)\in {\text {ran}}({\mathfrak {B}})\), say with \({{\mathbf {x}}}(0)={\mathfrak {B}}{{\mathbf {v}}}_0\). Then \({\text {supp}}({{\mathbf {v}}}_0)\subset [s,0]\) for some \(s<0\) and we let \(({{\mathbf {v}}},{{\mathbf {x}}}_{{\mathbf {v}}},{{\mathbf {y}}}_{{\mathbf {v}}})\) be an arbitrary trajectory of \(\Sigma \) over \({{\mathbb {R}}}\) with \(\pi _- {{\mathbf {v}}}={{\mathbf {v}}}_0\); then also \({{\mathbf {x}}}_{{\mathbf {v}}}(0)={\mathfrak {B}}\pi _-{{\mathbf {v}}}={{\mathbf {x}}}(0)\). Define

$$\begin{aligned} (\widetilde{{{\mathbf {v}}}},\widetilde{{{\mathbf {x}}}},\widetilde{{{\mathbf {y}}}}):=\tau ^t \pi _- ({{\mathbf {v}}},{{\mathbf {x}}}_{{\mathbf {v}}},{{\mathbf {y}}}_{{\mathbf {v}}}) + \tau ^{t} ({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}}). \end{aligned}$$

Using that \({{\mathbf {x}}}_{{\mathbf {v}}}(0)={{\mathbf {x}}}(0)\), we obtain from trajectory properties (5) and (3) that \((\widetilde{{{\mathbf {v}}}},\widetilde{{{\mathbf {x}}}},\widetilde{{{\mathbf {y}}}})\) is a trajectory of \(\Sigma \) over \({{\mathbb {R}}}\) with \({\text {supp}}(\pi _-{\widetilde{{{\mathbf {v}}}}})\subset [s-t,0]\) and \(\widetilde{{{\mathbf {x}}}}(0)={{\mathbf {x}}}(t)\). Then we have from (6.7) that

$$\begin{aligned}&\Vert \pi _{[s,0]} {{\mathbf {v}}}\Vert ^2_{L^{2-}_U} - \Vert \pi _{[s,0]}{{\mathbf {y}}}_{{\mathbf {v}}}\Vert ^2_{L^{2-}_Y}\\&\quad = \Vert \pi _{[s-t,-t]}\tau ^t {{\mathbf {v}}}\Vert ^2_{L^{2-}_U} - \Vert \pi _{[s-t,-t]} \tau ^t {{\mathbf {y}}}_{{\mathbf {v}}}\Vert ^2_{L^{2-}_Y}\\&\quad = \Vert \pi _{[s-t,0]}\widetilde{{{\mathbf {v}}}} \Vert ^2_{L^{2-}_U} - \Vert \pi _{[s-t,0]}\widetilde{{{\mathbf {y}}}}\Vert ^2_{L^{2-}_Y} -\Vert \pi _{[-t,0]}\tau ^t {{\mathbf {u}}}\Vert ^2_{L^{2-}_U} + \Vert \pi _{[-t,0]} \tau ^t {{\mathbf {y}}}\Vert ^2_{L^{2-}_Y}\\&\quad \geqslant S_r({{\mathbf {x}}}(t)) -\int _0^t\Vert {{\mathbf {u}}}(\tau )\Vert _U^2\,{\mathrm {d}}\tau +\int _0^t\Vert {{\mathbf {y}}}(\tau )\Vert _Y^2\,{\mathrm {d}}\tau . \end{aligned}$$

Taking the infimum over all \(({{\mathbf {v}}},{{\mathbf {y}}}_{{\mathbf {v}}},s)\in {\mathfrak {V}}_{{{\mathbf {x}}}(0)}\), we obtain that (6.1) holds for \(S=S_r\). Hence \(S_r\) is a storage function.

Step 3: Every storage function S for \(\Sigma \) satisfies \(S_a\leqslant S\leqslant S_r\). Let S be an arbitrary storage function for \(\Sigma \) and choose \(x_0\in X\). If \(S(x_0)=\infty \), then certainly \(S_a(x_0)\leqslant S(x_0)\). Hence assume \(S(x_0)<\infty \). Now let \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) be an arbitrary trajectory of \(\Sigma \) on \({{\mathbb {R}}}^+\) with \({{\mathbf {x}}}(0)=x_0\) and fix a \(t>0\). Since \(S({{\mathbf {x}}}(0))=S(x_0)<\infty \), by (6.1) we obtain that \(S({{\mathbf {x}}}(t))<\infty \). Reordering (6.1), we obtain that

$$\begin{aligned} \Vert \pi _{[0,t]}{{\mathbf {y}}}\Vert _{L^2_Y}^2 - \Vert \pi _{[0,t]}{{\mathbf {u}}}\Vert _{L^2_U}^2 \leqslant S({{\mathbf {x}}}(0))-S({{\mathbf {x}}}(t))\leqslant S(x_0). \end{aligned}$$

Taking the supremum over all trajectories \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma \) on \({{\mathbb {R}}}^+\) with \({{\mathbf {x}}}(0)=x_0\) and all \(t>0\), we obtain that \(S_a(x_0)\leqslant S(x_0)\). Hence \(S_a(x_0)\leqslant S(x_0)\) for all \(x_0\in X\).

Now we turn to the inequality for \(S_r\). If \(x_0\not \in {\text {ran}}({\mathfrak {B}})\), then \(S_r(x_0)=\infty \), and we clearly have \(S(x_0)\leqslant S_r(x_0)\). Hence, assume that \(x_0\in {\text {ran}}({\mathfrak {B}})\) and let \({{\mathbf {u}}}\in L^{2-}_{\ell ,U}\) be such that \(x_0={\mathfrak {B}}{{\mathbf {u}}}\). Let \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) be the uniquely determined trajectory for \(\Sigma \) over \({{\mathbb {R}}}\) with input \({{\mathbf {u}}}\), and fix \(t<0\) such that \({\text {supp}}({{\mathbf {u}}})\subset [t,0]\). Since \({{\mathbf {x}}}(t)={\mathfrak {B}}\pi _-\tau ^t{{\mathbf {u}}}={\mathfrak {B}}0=0\), trajectory properties (1) and (2) give that

$$\begin{aligned} (\widetilde{{{\mathbf {u}}}},\widetilde{{{\mathbf {x}}}},\widetilde{{{\mathbf {y}}}}):=\pi _+ \tau ^t({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}}) \end{aligned}$$

is a trajectory of \(\Sigma \) over \({{\mathbb {R}}}^+\), with \(\widetilde{{{\mathbf {x}}}}(0)=0\) and \(\widetilde{{{\mathbf {x}}}}(-t)={{\mathbf {x}}}(0)=x_0\). Hence \(S(\widetilde{{{\mathbf {x}}}}(0))=S(0)=0\). By (6.1), we then have

$$\begin{aligned} \Vert \pi _{[t,0]} {{\mathbf {u}}}\Vert ^2_{L^{2-}_U} - \Vert \pi _{[t,0]} {{\mathbf {y}}}\Vert ^2_{L^{2-}_Y}&= \Vert \pi _{[0,-t]} \tau ^t {{\mathbf {u}}}\Vert ^2_{L^{2+}_U} - \Vert \pi _{[0,-t]} \tau ^t {{\mathbf {y}}}\Vert ^2_{L^{2+}_Y}\\&= \Vert \pi _{[0,-t]} \widetilde{{{\mathbf {u}}}} \Vert ^2_{L^{2+}_U} - \Vert \pi _{[0,-t]} \widetilde{{{\mathbf {y}}}} \Vert ^2_{L^{2+}_Y}\\&\geqslant S(\widetilde{{{\mathbf {x}}}}(-t)) = S(x_0). \end{aligned}$$

Now, in the left hand side of the inequality, take the infimum over all trajectories \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma \) on \({{\mathbb {R}}}\) such that \({{\mathbf {x}}}(0)=x_0\), and all t such that \({\text {supp}}(\pi _-{{\mathbf {u}}})\subset [t,0]\). It then follows that \(S_r(x_0)\geqslant S(x_0)\). \(\square \)

Combining Proposition 6.1 and Theorem 6.5, we get the following corollary.

Corollary 6.6

The transfer function of a well-posed system \(\Sigma \) has an analytic continuation in the Schur class if and only if \(\Sigma \) has a storage function.

Next we derive more explicit formulas for \(S_a\) and \(S_r\) in terms of the operators constituting \(\Sigma \), and we determine quadratic storage functions for \(\Sigma \), leading to, in general unbounded, solutions to the KYP inequality for \(\Sigma \). For this purpose, assume that \(\widehat{{\mathfrak {D}}}|_{{{\mathbb {C}}}^+\cap {\text {dom}}(\widehat{{\mathfrak {D}}})}\) has an analytic continuation to a function in \({{\mathcal {S}}}_{U,Y}\). By item (1) of Theorem 3.4, the operator \(L_\Sigma \) in (3.8) decomposes as

$$\begin{aligned} L_\Sigma =\begin{bmatrix}\widetilde{{\mathfrak {T}}}_\Sigma &{} 0\\ {\mathfrak {H}}_\Sigma &{} {\mathfrak {T}}_\Sigma \end{bmatrix}:\begin{bmatrix}L^{2-}_U\\ L^{2+}_U\end{bmatrix} \rightarrow \begin{bmatrix}L^{2-}_Y\\ L^{2+}_Y\end{bmatrix}, \end{aligned}$$
(6.9)

with \({\mathfrak {H}}_\Sigma \) the \(L^2\)-Hankel operator of (3.9). Since \(\widehat{{\mathfrak {D}}}\in {{\mathcal {S}}}_{U,Y}\), we have \(\Vert L_\Sigma \Vert =\Vert M_{\widehat{{\mathfrak {D}}}}\Vert =\Vert \widehat{{\mathfrak {D}}}\Vert _\infty \leqslant 1\). Hence \(\widetilde{{\mathfrak {T}}}_\Sigma \), \({\mathfrak {H}}_\Sigma \) and \({\mathfrak {T}}_\Sigma \) are contractions as well. In the statement of the next lemma, the reader should recall the notation \(D_T := (I - T^* T)^{\frac{1}{2}}\) for the defect operator of a Hilbert-space contraction operator T, as introduced at the end of §1.
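The computations below repeatedly use the elementary identity that, for a contraction T and any vector u in its domain space,

$$\begin{aligned} \Vert D_T u\Vert ^2 = \langle (I-T^*T)u,u\rangle = \Vert u\Vert ^2 - \Vert Tu\Vert ^2, \end{aligned}$$

so that differences of the form \(\Vert u\Vert ^2-\Vert Tu\Vert ^2\), as in (6.12) below, are nonnegative and can be rewritten as squared norms.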

Lemma 6.7

Let \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) be a well-posed system, such that \(\widehat{{\mathfrak {D}}}\in {{\mathcal {S}}}_{U,Y}\). Define \({\mathbf {W}}_o\) as in §3 and decompose \(L_\Sigma \) in (3.8) as in (6.9). Then

$$\begin{aligned} S_a(x_0)&= \sup _{{{\mathbf {u}}}\in L^{2+}_U} \Vert {\mathbf {W}}_o x_0 + {\mathfrak {T}}_\Sigma {{\mathbf {u}}}\Vert ^2_{L^{2+}_Y} - \Vert {{\mathbf {u}}}\Vert ^2_{L^{2+}_U},\quad x_0\in {\text {dom}}({\mathbf {W}}_o), \end{aligned}$$
(6.10)
$$\begin{aligned} S_r(x_0)&= \inf _{{{\mathbf {u}}}\in L^{2-}_{\ell ,U}, x_0={\mathfrak {B}}{{\mathbf {u}}}} \Vert D_{{\widetilde{{\mathfrak {T}}}}_\Sigma } {{\mathbf {u}}}\Vert ^2_{L^{2-}_{U}}, \quad x_0\in X , \end{aligned}$$
(6.11)

and \(S_a(x_0)=\infty \) in case \(x_0\not \in {\text {dom}}({\mathbf {W}}_o)\). Finally, \(S_r(x_0)<\infty \) if and only if \(x_0\in \text{ Rea }\,(\Sigma )={\text {ran}}({\mathfrak {B}})\).

Note that for each \(x_0\in {\text {dom}}({\mathbf {W}}_o)\), formula (6.10) exhibits \(S_a(x_0)\) as the squared norm of \({\mathbf {W}}_ox_0\) in the Brangesian complement of the space \({\text {ran}}({\mathfrak {T}}_\Sigma )\); see the notes to Chapter I of [29], or [6, §3].

Proof of Lemma 6.7

We start with \(S_a\). Using (6.6), (2.3) and (2.1), it follows that

$$\begin{aligned} S_a(x_0)=\sup _{{{\mathbf {v}}}\in L^{2+}_{loc,U},\, t>0} \left( \Vert {\mathfrak {C}}^t x_0+{\mathfrak {D}}^t {{\mathbf {v}}}\Vert ^2_{L^{2+}_Y} - \Vert \pi _{[0,t]}{{\mathbf {v}}}\Vert ^2_{L^{2+}_U} \right) ,\quad x_0\in X. \end{aligned}$$

In case \(x_0\not \in {\text {dom}}({\mathbf {W}}_o)\), we have \({\mathfrak {C}}x_0\not \in L^{2+}_Y\), and fixing \({{\mathbf {v}}}=0\) in the preceding supremum, we see that

$$\begin{aligned} S_a(x_0)\geqslant \sup _{t>0}\Vert {\mathfrak {C}}^t x_0\Vert ^2_{L^{2+}_Y}=\sup _{t>0}\Vert \pi _{[0,t]}{\mathfrak {C}}x_0\Vert ^2_{L^{2+}_Y}=\infty . \end{aligned}$$

Now take \(x_0\in {\text {dom}}({\mathbf {W}}_o)\). Then \({\mathfrak {C}}^t x_0 =\pi _{[0,t]}{\mathbf {W}}_o x_0\). For now, fix \(t>0\) and \({{\mathbf {v}}}\in L^{2+}_{loc,U}\). By combining the causality and time-invariance of \({\mathfrak {D}}\) (see item (4) of Definition 2.1), it follows that \(\pi _{[0,t]}{\mathfrak {D}}=\pi _{[0,t]}{\mathfrak {D}}\pi _{(-\infty ,t]}\). By Theorem 3.4 and because \({\text {supp}}({{\mathbf {v}}})\subset [0,\infty )\), we have \({\mathfrak {D}}^t {{\mathbf {v}}}=\pi _{[0,t]} {\mathfrak {D}}\pi _{[0,t]}{{\mathbf {v}}}=\pi _{[0,t]} L_\Sigma \pi _{[0,t]} {{\mathbf {v}}}= \pi _{[0,t]} {\mathfrak {T}}_\Sigma \pi _{[0,t]} {{\mathbf {v}}}\). Thus \(S_a\) can be written as

$$\begin{aligned} S_a(x_0)=\sup _{{{\mathbf {v}}}\in L^{2+}_{loc,U},\, t>0} \left( \Vert \pi _{[0,t]} ({\mathbf {W}}_o x_0+ {\mathfrak {T}}_\Sigma \pi _{[0,t]}{{\mathbf {v}}}) \Vert ^2_{L^{2+}_Y} - \Vert \pi _{[0,t]}{{\mathbf {v}}}\Vert ^2_{L^{2+}_U} \right) . \end{aligned}$$

Next we show that \(\pi _{[0,t]}\) can be removed everywhere in the right hand side. Set \({{\mathbf {w}}}:=\pi _{[0,t]}{{\mathbf {v}}}\in L^{2+}_U\), so that

$$\begin{aligned}&\Vert \pi _{[0,t]} ({\mathbf {W}}_o x_0+ {\mathfrak {T}}_\Sigma \pi _{[0,t]}{{\mathbf {v}}}) \Vert ^2_{L^{2+}_Y} - \Vert \pi _{[0,t]}{{\mathbf {v}}}\Vert ^2_{L^{2+}_U}\\&\quad =\Vert \pi _{[0,t]} ({\mathbf {W}}_o x_0+ {\mathfrak {T}}_\Sigma {{\mathbf {w}}}) \Vert ^2_{L^{2+}_Y} - \Vert {{\mathbf {w}}}\Vert ^2_{L^{2+}_U} \leqslant \Vert {\mathbf {W}}_o x_0+ {\mathfrak {T}}_\Sigma {{\mathbf {w}}}\Vert ^2_{L^{2+}_Y} - \Vert {{\mathbf {w}}}\Vert ^2_{L^{2+}_U}. \end{aligned}$$

It follows that \(S_a(x_0)\) is dominated by the right-hand side of (6.10). Conversely, for each fixed \({{\mathbf {w}}}\in L^{2+}_U\), taking \({{\mathbf {v}}}={{\mathbf {w}}}\) and letting \(t\rightarrow \infty \) above shows that \(\Vert {\mathbf {W}}_o x_0+ {\mathfrak {T}}_\Sigma {{\mathbf {w}}}\Vert ^2_{L^{2+}_Y} - \Vert {{\mathbf {w}}}\Vert ^2_{L^{2+}_U}\) is also dominated by \(S_a(x_0)\). Thus (6.10) holds.

Now we turn to the proof of the formula for \(S_r\). If \(x_0 \notin {\text {Rea}}(\Sigma ) = {\text {ran}}({\mathfrak {B}})\), then \({\mathfrak {V}}_{x_0} = \emptyset \), so that \(S_r(x_0) = \infty \) in (6.7), as in Step 3 in the proof of Theorem 6.5; in this case both sides of (6.11) are infinite, so (6.11) is correct. Next suppose that \(x_0 \in {\text {Rea}}(\Sigma ) = {\text {ran}}({\mathfrak {B}})\), so that \({\mathfrak {V}}_{x_0} \ne \emptyset \). Let \(({{\mathbf {v}}}, {{\mathbf {y}}},t)\) be an arbitrary element of \({\mathfrak {V}}_{x_0}\). Then \({\text {supp}}(\pi _- {{\mathbf {v}}}) \subset [t, 0]\) and \(({{\mathbf {v}}}, {{\mathbf {y}}})\) embeds into a system trajectory \(({{\mathbf {v}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma \) on \({{\mathbb {R}}}\) such that \({{\mathbf {x}}}(0) = x_0\).

By (2.4), combined with the causality and time-invariance of \({\mathfrak {D}}\), we have

$$\begin{aligned} \pi _-{{\mathbf {y}}}=\pi _- {\mathfrak {D}}{{\mathbf {v}}}=\pi _- {\mathfrak {D}}\pi _- {{\mathbf {v}}}=\pi _- {\mathfrak {D}}\pi _{[t,0]} {{\mathbf {v}}}=\pi _{[t,0]}{\mathfrak {D}}\pi _{[t,0]}{{\mathbf {v}}}=\pi _{[t,0]}{{\mathbf {y}}}. \end{aligned}$$

In particular, the value of \(\Vert \pi _- {{\mathbf {v}}}\Vert ^2- \Vert \pi _- {{\mathbf {y}}}\Vert ^2=\Vert \pi _{[t,0]}{{\mathbf {v}}}\Vert ^2- \Vert \pi _{[t,0]}{{\mathbf {y}}}\Vert ^2\) only depends on \({{\mathbf {u}}}:=\pi _- {{\mathbf {v}}}\in L^{2-}_{\ell ,U}\), and thus we may assume without loss of generality that \({{\mathbf {v}}}={{\mathbf {u}}}\in L^{2-}_{\ell ,U}\). In that case, Theorem 3.4 shows that \({{\mathbf {y}}}={\mathfrak {D}}{{\mathbf {u}}}= L_\Sigma {{\mathbf {u}}}\), and by (6.9) we have \(\pi _- {{\mathbf {y}}}= \pi _- {\mathfrak {D}}\pi _- {{\mathbf {u}}}= \widetilde{{\mathfrak {T}}}_\Sigma \pi _-{{\mathbf {v}}}\). Thus

$$\begin{aligned} \Vert \pi _{[t,0]} {{\mathbf {v}}}\Vert ^2_{L^{2-}_U} - \Vert \pi _{[t,0]} {{\mathbf {y}}}\Vert ^2_{L^{2-}_Y} =\Vert \pi _- {{\mathbf {v}}}\Vert ^2_{L^{2-}_U} - \Vert \widetilde{{\mathfrak {T}}}_\Sigma \pi _-{{\mathbf {v}}}\Vert ^2_{L^{2-}_Y} =\Vert D_{\widetilde{{\mathfrak {T}}}_\Sigma } \pi _- {{\mathbf {v}}}\Vert ^2_{L^{2-}_U}.\nonumber \\ \end{aligned}$$
(6.12)

As \(({{\mathbf {v}}}, {{\mathbf {y}}}, t)\) was chosen to be an arbitrary element of \({\mathfrak {V}}_{x_0}\) and \({{\mathbf {u}}}=\pi _-{{\mathbf {v}}}\in L^{2-}_{\ell ,U}\) satisfies \(x_0={\mathfrak {B}}{{\mathbf {u}}}\), we conclude that \(S_r(x_0)\) (as defined by (6.7)) is greater than or equal to the right-hand side of (6.11).

To conclude that in fact equality holds, just note that starting from \({{\mathbf {u}}}\in L^{2-}_{\ell ,U}\) with \(x_0={\mathfrak {B}}{{\mathbf {u}}}\) one obtains a triple \(({{\mathbf {v}}},{{\mathbf {y}}},t)\) in \({\mathfrak {V}}_{x_0}\) by taking \({{\mathbf {v}}}:={{\mathbf {u}}}\), letting \(t<0\) be such that \({\text {supp}}(\pi _-{{\mathbf {v}}})={\text {supp}}({{\mathbf {u}}})\subset [t,0]\), and defining \({{\mathbf {x}}}\) and \({{\mathbf {y}}}\) by (2.4). Then (6.12) shows that \(S_r(x_0)\) is dominated by the right-hand side of (6.11), and hence the expressions for \(S_r\) are equal, as claimed. \(\square \)
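When \({\mathfrak {T}}_\Sigma \) is a strict contraction (as will be the case in Proposition 8.1 below), the supremum in (6.10) can be evaluated in closed form by completing the square: writing \(y={\mathbf {W}}_o x_0\) and \({\mathfrak {T}}={\mathfrak {T}}_\Sigma \) for brevity, one has, for every \({{\mathbf {u}}}\in L^{2+}_U\),

$$\begin{aligned} \Vert y+{\mathfrak {T}}{{\mathbf {u}}}\Vert ^2 - \Vert {{\mathbf {u}}}\Vert ^2 = \Vert D_{{\mathfrak {T}}^*}^{-1}y\Vert ^2 - \Vert D_{{\mathfrak {T}}}({{\mathbf {u}}}-{{\mathbf {u}}}_*)\Vert ^2, \quad {{\mathbf {u}}}_*:=(I-{\mathfrak {T}}^*{\mathfrak {T}})^{-1}{\mathfrak {T}}^*y, \end{aligned}$$

as one verifies by expanding both sides and using the identity \(I+{\mathfrak {T}}(I-{\mathfrak {T}}^*{\mathfrak {T}})^{-1}{\mathfrak {T}}^*=(I-{\mathfrak {T}}{\mathfrak {T}}^*)^{-1}\). Hence in that case the supremum is attained at \({{\mathbf {u}}}={{\mathbf {u}}}_*\) and \(S_a(x_0)=\Vert D_{{\mathfrak {T}}_\Sigma ^*}^{-1}{\mathbf {W}}_o x_0\Vert ^2\) for \(x_0\in {\text {dom}}({\mathbf {W}}_o)\), anticipating the first formula in (8.2) below.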

By the preceding analysis, \(S_r(x_0)=\infty \) precisely when \(x_0\not \in \text{ Rea }\,(\Sigma )={\text {ran}}({\mathfrak {B}})\), which in general is a proper subset of \({\text {ran}}({\mathbf {W}}_c)\); hence \(S_r\) need not be an \(L^2\)-regular storage function as defined at the beginning of §6. However, assuming that \({\text {dom}}({\mathbf {W}}_c^\bigstar )\) is dense, we can define the following variation on \(S_r\):

$$\begin{aligned} {\underline{S}}_r(x_0) := \inf _{{{\mathbf {u}}}\in {\mathbf {W}}_c^{-1}(\{x_0\})} \Vert D_{{\widetilde{{\mathfrak {T}}}}_\Sigma } {{\mathbf {u}}}\Vert ^2_{L^{2-}_{U}}, \quad x_0\in X, \end{aligned}$$
(6.13)

where

$$\begin{aligned} {\mathbf {W}}_c^{-1}(\{x_0\}):=\{{{\mathbf {u}}}\in {\text {dom}}({\mathbf {W}}_c) \mid {\mathbf {W}}_c {{\mathbf {u}}}=x_0\}. \end{aligned}$$

Proposition 6.8

Assume that the well-posed system \(\Sigma \) has transfer function in \({{\mathcal {S}}}_{U,Y}\) and that \({\mathbf {W}}_c^\bigstar \) is densely defined. Then \(S_a\) and \({\underline{S}}_r\) are \(L^2\)-regular storage functions.

Proof

We first prove that \({\underline{S}}_r\) is an \(L^2\)-regular storage function. Clearly \({\underline{S}}_r(x_0)\geqslant 0\) for all \(x_0\in X\). Also, for \(x_0=0\) we can select \({{\mathbf {u}}}:=0\in {\mathbf {W}}_c^{-1}(\{0\})\), obtaining that \({\underline{S}}_r(0)\leqslant \Vert D_{{\widetilde{{\mathfrak {T}}}}_\Sigma } 0\Vert ^2=0\). Hence \({\underline{S}}_r(0)=0\).

Next we prove that \({\underline{S}}_r\) satisfies the energy inequality (6.1). To this end, fix a system trajectory \((\widetilde{{{\mathbf {u}}}},\widetilde{{{\mathbf {x}}}},\widetilde{{{\mathbf {y}}}})\) of \(\Sigma \) over \({{\mathbb {R}}}^+\) and a \(t>0\). If \({\widetilde{{{\mathbf {x}}}}}(0)\not \in {\text {ran}}({\mathbf {W}}_c)\) then \({\underline{S}}_r({\widetilde{{{\mathbf {x}}}}}(0))=\inf \emptyset =\infty \) and (6.1) holds; otherwise let \({{\mathbf {u}}}\in {\mathbf {W}}_c^{-1}(\{{\widetilde{{{\mathbf {x}}}}}(0)\})\subset L^{2-}_U\). Then define

$$\begin{aligned} {{\mathbf {u}}}^\circ :=\pi _-\tau ^t({{\mathbf {u}}}+\widetilde{{{\mathbf {u}}}})= \tau ^t({{\mathbf {u}}}+ \pi _{[0,t]}\widetilde{{{\mathbf {u}}}})\in L^{2-}_U, \end{aligned}$$
(6.14)

and note that

$$\begin{aligned} \Vert {{\mathbf {u}}}^\circ \Vert ^2_{L^{2-}_U}=\Vert {{\mathbf {u}}}\Vert ^2_{L^{2-}_U} + \Vert \pi _{[0,t]}\widetilde{{{\mathbf {u}}}}\Vert ^2_{L^{2+}_U}. \end{aligned}$$
(6.15)

We claim that

$$\begin{aligned} \text{(1) } \ {{\mathbf {u}}}^\circ \in {\mathbf {W}}_c^{-1}(\{\widetilde{{{\mathbf {x}}}}(t)\}) \qquad \text{ and } \qquad \text{(2) } \ {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}^\circ =\tau ^t ({\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}+ \pi _{[0,t]}\widetilde{{{\mathbf {y}}}}). \end{aligned}$$
(6.16)

For claim (1), note that item (3) of Proposition 3.2 implies that \(\tau ^t \pi _{[0,t]}\widetilde{{{\mathbf {u}}}}\in L^{2-}_{\ell ,U}\) is in \({\text {dom}}({\mathbf {W}}_c)\) and

$$\begin{aligned} {\mathbf {W}}_c \tau ^t \pi _{[0,t]}\widetilde{{{\mathbf {u}}}}= {\mathfrak {B}}\tau ^t \pi _{[0,t]}\widetilde{{{\mathbf {u}}}} = {\mathfrak {B}}\pi _- \tau ^t \widetilde{{{\mathbf {u}}}}= {\mathfrak {B}}^t \widetilde{{{\mathbf {u}}}}. \end{aligned}$$

Also, item (4) of Proposition 3.2 yields that \(\tau ^t {{\mathbf {u}}}\) is in \({\text {dom}}({\mathbf {W}}_c)\) and \({\mathbf {W}}_c \tau ^t {{\mathbf {u}}}= {\mathfrak {A}}^t {\mathbf {W}}_c {{\mathbf {u}}}= {\mathfrak {A}}^t \widetilde{{{\mathbf {x}}}}(0)\). Therefore we have that \({{\mathbf {u}}}^\circ \in {\text {dom}}({\mathbf {W}}_c)\) and

$$\begin{aligned} {\mathbf {W}}_c {{\mathbf {u}}}^\circ = {\mathbf {W}}_c \tau ^t {{\mathbf {u}}}+ {\mathbf {W}}_c \tau ^t \pi _{[0,t]}\widetilde{{{\mathbf {u}}}} = {\mathfrak {A}}^t \widetilde{{{\mathbf {x}}}}(0) + {\mathfrak {B}}^t \widetilde{{{\mathbf {u}}}} = \widetilde{{{\mathbf {x}}}}(t), \end{aligned}$$

using (2.3) in the last identity. Next we prove claim (2). By item (1) of Theorem 3.4 and (6.9),

$$\begin{aligned} \pi _- L_\Sigma \tau ^t&= \pi _- \tau ^t L_\Sigma = \tau ^t \pi _{(-\infty ,t]} L_\Sigma \\&= \tau ^t (\widetilde{{\mathfrak {T}}}_\Sigma \pi _- + \pi _{[0,t]} {\mathfrak {H}}_\Sigma \pi _- + \pi _{[0,t]} {\mathfrak {T}}_\Sigma \pi _+). \end{aligned}$$

Therefore, from (6.14), we get

$$\begin{aligned} {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}^\circ&= \pi _- L_\Sigma {{\mathbf {u}}}^\circ = \pi _- L_\Sigma \tau ^t ({{\mathbf {u}}}+ \pi _{[0,t]}\widetilde{{{\mathbf {u}}}}) \\&= \tau ^t (\widetilde{{\mathfrak {T}}}_\Sigma {{\mathbf {u}}}+ \pi _{[0,t]} {\mathfrak {H}}_\Sigma {{\mathbf {u}}}+ \pi _{[0,t]} {\mathfrak {T}}_\Sigma \pi _{[0,t]}\widetilde{{{\mathbf {u}}}}), \end{aligned}$$

and furthermore, by (3.12),

$$\begin{aligned} \pi _{[0,t]} {\mathfrak {H}}_\Sigma {{\mathbf {u}}}= \pi _{[0,t]} {\mathbf {W}}_o {\mathbf {W}}_c {{\mathbf {u}}}=\pi _{[0,t]} {\mathbf {W}}_o \widetilde{{{\mathbf {x}}}}(0) = \pi _{[0,t]} {\mathfrak {C}}\widetilde{{{\mathbf {x}}}}(0) = {\mathfrak {C}}^t \widetilde{{{\mathbf {x}}}}(0). \end{aligned}$$

On the other hand, using item (1) of Theorem 3.4 and causality, we obtain

$$\begin{aligned} \pi _{[0,t]} {\mathfrak {T}}_\Sigma \pi _{[0,t]}\widetilde{{{\mathbf {u}}}}= \pi _{[0,t]} {\mathfrak {D}}\pi _{[0,t]}\widetilde{{{\mathbf {u}}}} = {\mathfrak {D}}^t \widetilde{{{\mathbf {u}}}}. \end{aligned}$$

Combining the above computations we find that

$$\begin{aligned} {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}^\circ&= \tau ^t (\widetilde{{\mathfrak {T}}}_\Sigma {{\mathbf {u}}}+ {\mathfrak {C}}^t \widetilde{{{\mathbf {x}}}}(0) + {\mathfrak {D}}^t \widetilde{{{\mathbf {u}}}}) =\tau ^t (\widetilde{{\mathfrak {T}}}_\Sigma {{\mathbf {u}}}+ \pi _{[0,t]} \widetilde{{{\mathbf {y}}}}), \end{aligned}$$

again using (2.3) in the last step. This proves claim (2).

Claim (2) implies that \(\Vert {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}^\circ \Vert _{L^{2-}_Y}^2 =\Vert {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}\Vert _{L^{2-}_Y}^2 +\Vert \pi _{[0,t]}\widetilde{{{\mathbf {y}}}}\Vert ^2_{L^{2+}_Y}\), since the two summands in claim (2) have disjoint supports and \(\tau ^t\) acts unitarily. Combining this with (6.15), we find that

$$\begin{aligned} \Vert {{\mathbf {u}}}^\circ \Vert _{L^{2-}_U}^2-\Vert {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}^\circ \Vert _{L^{2-}_Y}^2=\Vert {{\mathbf {u}}}\Vert _{L^{2-}_U}^2 - \Vert {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}\Vert _{L^{2-}_Y}^2 + \Vert \pi _{[0,t]}\widetilde{{{\mathbf {u}}}}\Vert _{L^{2+}_U}^2 - \Vert \pi _{[0,t]}\widetilde{{{\mathbf {y}}}}\Vert ^2_{L^{2+}_Y}. \end{aligned}$$

By claim (1) in (6.16), \(\pi _-\tau ^t\big ({\mathbf {W}}_c^{-1}(\{{\widetilde{{{\mathbf {x}}}}}(0)\})+{\widetilde{{{\mathbf {u}}}}}\big )\subset {\mathbf {W}}_c^{-1}(\{{\widetilde{{{\mathbf {x}}}}}(t)\})\), and so we get that

$$\begin{aligned}&\inf _{{{\mathbf {u}}}^\circ \in {\mathbf {W}}_c^{-1}(\{{\widetilde{{{\mathbf {x}}}}}(t)\})} \Vert {{\mathbf {u}}}^\circ \Vert _{L^{2-}_U}^2-\Vert {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}^\circ \Vert _{L^{2-}_Y}^2\\&\quad \leqslant \inf _{{{\mathbf {u}}}\in {\mathbf {W}}_c^{-1}(\{{\widetilde{{{\mathbf {x}}}}}(0)\})} \Vert {{\mathbf {u}}}\Vert _{L^{2-}_U}^2 - \Vert {\widetilde{{\mathfrak {T}}}}_\Sigma {{\mathbf {u}}}\Vert _{L^{2-}_Y}^2 + \Vert \pi _{[0,t]}\widetilde{{{\mathbf {u}}}}\Vert _{L^{2+}_U}^2 - \Vert \pi _{[0,t]}\widetilde{{{\mathbf {y}}}}\Vert ^2_{L^{2+}_Y}. \end{aligned}$$

This shows that \({\underline{S}}_r\) satisfies the energy inequality (6.1), and hence it is a storage function. We already established that \(S_a\) is a storage function.

That \({\underline{S}}_r\) is finite on \({\text {ran}}({{\mathbf {W}}}_c)\) follows from Corollary 6.9 below, and then \(S_a\) is finite on \({\text {ran}}({\mathbf {W}}_c)\) as well, since (6.8) holds with \(S={\underline{S}}_r\). This completes the proof that \(S_a\) and \({\underline{S}}_r\) are \(L^2\)-regular storage functions. \(\square \)

Corollary 6.9

Assume that the well-posed system \(\Sigma \) has a transfer function \({\widehat{{\mathfrak {D}}}}\in {{\mathcal {S}}}_{U,Y}\) and that \({\mathbf {W}}_c^\bigstar \) is densely defined. Then for all \(x_0 \in X\) we have

$$\begin{aligned} \Vert {\mathbf {W}}_o x_0\Vert _{L^{2+}_Y}^2 \leqslant S_a(x_0)\leqslant {\underline{S}}_r(x_0)\leqslant \inf _{{{\mathbf {u}}}\in {\mathbf {W}}_c^{-1}(\{x_0\})}\Vert {{\mathbf {u}}}\Vert ^2_{L^{2-}_U}, \end{aligned}$$

with \(\Vert {\mathbf {W}}_o x_0\Vert _{L^{2+}_Y}^2\) to be interpreted as \(\infty \) in case \(x_0\not \in {\text {dom}}({\mathbf {W}}_o)\). Moreover, \({\underline{S}}_r(x_0)<\infty \) precisely when \(x_0\in {\text {ran}}({\mathbf {W}}_c)\).

Proof

The first inequality is obtained by selecting \({{\mathbf {u}}}=0\) for the input signals in the supremum in (6.10). The second inequality follows from (6.8), using that \({\underline{S}}_r\) is a storage function for \(\Sigma \) by Proposition 6.8. The final inequality follows from the definition of \({\underline{S}}_r\) in (6.13) and the fact that \(D_{{\widetilde{{\mathfrak {T}}}}_\Sigma }\) is contractive. If \(x_0\not \in {\text {ran}}({\mathbf {W}}_c)\), then the infimum in (6.13) is taken over an empty set, leading to \({\underline{S}}_r(x_0)=\infty \). \(\square \)
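Note that the outer terms in this chain of inequalities recover the contractivity of the \(L^2\)-Hankel operator on \({\text {dom}}({\mathbf {W}}_c)\): if \(x_0={\mathbf {W}}_c{{\mathbf {u}}}\) for some \({{\mathbf {u}}}\in {\text {dom}}({\mathbf {W}}_c)\), then

$$\begin{aligned} \Vert {\mathbf {W}}_o {\mathbf {W}}_c {{\mathbf {u}}}\Vert ^2_{L^{2+}_Y}\leqslant {\underline{S}}_r({\mathbf {W}}_c {{\mathbf {u}}})\leqslant \Vert {{\mathbf {u}}}\Vert ^2_{L^{2-}_U}, \end{aligned}$$

in accordance with the factorization \({\mathfrak {H}}_\Sigma |_{{\text {dom}}({\mathbf {W}}_c)}={\mathbf {W}}_o{\mathbf {W}}_c\) of (3.12) and the fact that \({\mathfrak {H}}_\Sigma \) is a contraction.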

We next establish that the storage functions \(S_a\) and \({\underline{S}}_r\) are in fact quadratic.

7 Quadratic Descriptions of \(S_a\) and \({\underline{S}}_r\)

In the sequel, we will need the concept of a core for a closed operator, which we recall here from [27, p. 256]: a set \(D\subset {\text {dom}}(T)\) is a core for the closed operator T if the closure of \(T|_D\) equals T; in other words, a closed operator is uniquely determined by its restriction to a core.

If \(\Sigma \) is a well-posed system whose transfer function \({\widehat{{\mathfrak {D}}}}\) is in \({{\mathcal {S}}}_{U,Y}\), then the \(L^2\)-transfer map \(L_\Sigma \) in (3.8) is contractive. Hence, with respect to the decomposition in (6.9), we have

$$\begin{aligned} \begin{aligned} I-L_\Sigma L_\Sigma ^*&= \begin{bmatrix} D_{\widetilde{\mathfrak {T}}_\Sigma ^{*}}^{2} &{} - {\widetilde{{\mathfrak {T}}}}_\Sigma {\mathfrak {H}}_\Sigma ^{*} \\ - {\mathfrak {H}}_\Sigma {\widetilde{{\mathfrak {T}}}}_\Sigma ^{*} &{} D_{{\mathfrak {T}}_\Sigma ^{*}}^{2} - {\mathfrak {H}}_\Sigma {\mathfrak {H}}_\Sigma ^{*} \end{bmatrix}\succeq 0;\\ I-L_\Sigma ^* L_\Sigma&= \begin{bmatrix} D_{\widetilde{\mathfrak {T}}_\Sigma }^{2} - {\mathfrak {H}}_\Sigma ^{*} {\mathfrak {H}}_\Sigma &{} -{\mathfrak {H}}_\Sigma ^*{\mathfrak {T}}_\Sigma \\ -{\mathfrak {T}}_\Sigma ^* {\mathfrak {H}}_\Sigma &{} D_{{\mathfrak {T}}_\Sigma }^{2} \end{bmatrix}\succeq 0. \end{aligned} \end{aligned}$$
(7.1)

Since \(L_\Sigma \) is a contraction, so are \({\mathfrak {T}}_\Sigma \), \({\mathfrak {T}}_\Sigma ^{*}\), \({\widetilde{{\mathfrak {T}}}}_\Sigma \) and \({\widetilde{{\mathfrak {T}}}}_\Sigma ^{*}\), and hence their defect operators \(D_{{\mathfrak {T}}_\Sigma }\), \(D_{{\mathfrak {T}}_\Sigma ^{*}}\), \(D_{{\widetilde{{\mathfrak {T}}}}_\Sigma }\) and \(D_{{\widetilde{{\mathfrak {T}}}}_\Sigma ^{*}}\) are well defined. The inequalities in (7.1) imply in particular that

$$\begin{aligned} D_{{\mathfrak {T}}_\Sigma ^{*}}^{2} \succeq {\mathfrak {H}}_\Sigma {\mathfrak {H}}_\Sigma ^{*} \quad \text{ and }\quad D_{{\widetilde{{\mathfrak {T}}}}_\Sigma }^{2} \succeq {\mathfrak {H}}_\Sigma ^{*} {\mathfrak {H}}_\Sigma . \end{aligned}$$
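Indeed, both inequalities follow by testing (7.1) against vectors with one vanishing component: for \(y\in L^{2+}_Y\) and \(u\in L^{2-}_U\),

$$\begin{aligned} 0&\leqslant \left\langle (I-L_\Sigma L_\Sigma ^*)\begin{bmatrix}0\\ y\end{bmatrix},\begin{bmatrix}0\\ y\end{bmatrix}\right\rangle = \langle (D_{{\mathfrak {T}}_\Sigma ^{*}}^{2} - {\mathfrak {H}}_\Sigma {\mathfrak {H}}_\Sigma ^{*})y,y\rangle ,\\ 0&\leqslant \left\langle (I-L_\Sigma ^* L_\Sigma )\begin{bmatrix}u\\ 0\end{bmatrix},\begin{bmatrix}u\\ 0\end{bmatrix}\right\rangle = \langle (D_{\widetilde{{\mathfrak {T}}}_\Sigma }^{2} - {\mathfrak {H}}_\Sigma ^{*} {\mathfrak {H}}_\Sigma )u,u\rangle . \end{aligned}$$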

Assume, in addition, that \(\Sigma \) is minimal. Then \({\text {ran}}({\mathbf {W}}_c)\) and \({\text {ran}}({\mathbf {W}}_o^*)\) are dense in X, by Corollary 3.5 combined with item (3) of Propositions 3.2 and 3.1, respectively, so that the factorizations of item (4) in Theorem 3.4 apply:

$$\begin{aligned} {\mathfrak {H}}_\Sigma \big |_{{\text {dom}}({\mathbf {W}}_c )}={\mathbf {W}}_o {\mathbf {W}}_c \quad \text{ and }\quad {\mathfrak {H}}_\Sigma ^*\big |_{{\text {dom}}({\mathbf {W}}_o^* )}={\mathbf {W}}_c^\bigstar {\mathbf {W}}_o^*. \end{aligned}$$

The following lemma follows from Lemma A.1 in Appendix A below, combined with (6.9), (A.1), (3.12) and (A.2):

Lemma 7.1

Assume that the minimal well-posed system \(\Sigma \) has transfer function in \({{\mathcal {S}}}_{U,Y}\). Then:

  1. (1)

    There exists a unique closable operator \(\mathbf{X}_a\) with domain \({\text {ran}}({\mathbf {W}}_c)\subset X\), with range contained in \({\text {ker}}(D_{{\mathfrak {T}}^*_{\Sigma }})^\perp \), and which satisfies the factorization

    $$\begin{aligned} {\mathbf {W}}_o|_{{\text {ran}}({\mathbf {W}}_c)} = D_{{\mathfrak {T}}^*_{\Sigma }} \mathbf{X}_a. \end{aligned}$$
    (7.2)

    Moreover, \({\text {ran}}({\mathbf {W}}_c)\) is a core for the closure \(\overline{\mathbf{X}}_a\) of \(\mathbf{X}_a\), and this closure is injective with range contained in \({\text {ker}}(D_{{\mathfrak {T}}^*_{\Sigma }})^\perp \).

  2. (2)

    There exists a unique closable operator \(\mathbf{X}_r\) with domain \({\text {ran}}({\mathbf {W}}_o^*)\subset X\), range contained in \({\text {ker}}(D_{\widetilde{{\mathfrak {T}}}_{\Sigma }})^\perp \), that satisfies the factorization

    $$\begin{aligned} {\mathbf {W}}_c^*|_{{\text {ran}}({\mathbf {W}}_o^*)} = D_{\widetilde{{\mathfrak {T}}}_{\Sigma }} \mathbf{X}_r. \end{aligned}$$
    (7.3)

The range of \({\mathbf {W}}_o^*\) is a core for the injective closure \(\overline{\mathbf{X}}_r\) of \(\mathbf{X}_r\), and \({\text {ran}}({\overline{{{\mathbf {X}}}}}_r)\perp {\text {ker}}(D_{\widetilde{{\mathfrak {T}}}_{\Sigma }})\).

Next we introduce operators \(H_a\) and \(H_r\), which give rise to the quadratic storage functions \(S_{H_a}(x) = \langle H_a x, x \rangle \) and \(S_{H_r}(x) = \langle H_r x, x \rangle \) which are equal to the available storage function \(S_a(x)\) and the \(L^2\)-regularized required supply \({\underline{S}}_r(x)\) respectively, at least for \(x\in {\text {ran}}({\mathbf {W}}_c)\). Assume that \(\Sigma \) is minimal and has transfer function in \({{\mathcal {S}}}_{U,Y}\), so that \(\mathbf{X}_a\) and \(\mathbf{X}_r\) in Lemma 7.1 are densely defined, closable operators with injective closures \({\overline{\mathbf{X}}}_a\) and \({\overline{\mathbf{X}}}_r\), respectively. Then, \({\overline{\mathbf{X}}}_a^*{\overline{\mathbf{X}}}_a\) is selfadjoint with unique positive, selfadjoint, injective square root \(|{\overline{\mathbf{X}}}_a|=({\overline{\mathbf{X}}}_a^*{\overline{\mathbf{X}}}_a)^{\frac{1}{2}}\) satisfying \({\text {dom}}(|{\overline{\mathbf{X}}}_a|)={\text {dom}}({\overline{\mathbf{X}}}_a)\); see for instance [27, §VIII.9]. We now set \(H_a={\overline{\mathbf{X}}}_a^*{\overline{\mathbf{X}}}_a\) so that \(H_a^{\frac{1}{2}}=|{\overline{\mathbf{X}}}_a|\). Analogously, set \(|{\overline{\mathbf{X}}}_r|:=({\overline{\mathbf{X}}}_r^*{\overline{\mathbf{X}}}_r)^{\frac{1}{2}}\) and \(H_r:=({\overline{\mathbf{X}}}_r^*{\overline{\mathbf{X}}}_r)^{-1}\), so that \(H_r^{\frac{1}{2}}=|{\overline{\mathbf{X}}}_r|^{-1}\), with \({\text {dom}}(H_r^{\frac{1}{2}})={\text {ran}}(|\overline{\mathbf{X}}_r|)\). Note that the operators \(H_a^{\frac{1}{2}}\), \(H_a^{-\frac{1}{2}}\), \(H_r^{\frac{1}{2}}\) and \(H_r^{-\frac{1}{2}}\) are all closed. The following theorem follows directly from Theorem A.2 in Appendix A.

Theorem 7.2

Let \(\Sigma \) be a minimal well-posed system which has transfer function in \({{\mathcal {S}}}_{U,Y}\). Define \(\mathbf{X}_a\), \({\overline{\mathbf{X}}}_a\), \(\mathbf{X}_r\), \({\overline{\mathbf{X}}}_r\) as in Lemma 7.1 and \(H_a\) and \(H_r\) as in the preceding paragraph. Then the dense subspace \({\text {ran}}({\mathbf {W}}_c)\) of X is contained in the domains of \(H_a^{\frac{1}{2}}\) and \(H_r^{\frac{1}{2}}\), and \(S_a\) and \({\underline{S}}_r\) satisfy

$$\begin{aligned} \begin{aligned} S_a(x_0)&= \Vert |{\overline{\mathbf{X}}}_a|x_0\Vert ^2=\Vert H_a^{\frac{1}{2}}x_0\Vert ^2,\quad x_0\in {\text {ran}}({\mathbf {W}}_c), \\ {\underline{S}}_r(x_0)&= \Vert |{\overline{\mathbf{X}}}_r|^{-1} x_0\Vert ^2=\Vert H_r^{\frac{1}{2}}x_0\Vert ^2,\quad x_0\in {\text {ran}}({\mathbf {W}}_c). \end{aligned} \end{aligned}$$
(7.4)

Moreover, \({\text {ran}}({\mathbf {W}}_c)\) is a core for \(H_a^{\frac{1}{2}}\) and \({\text {ran}}({\mathbf {W}}_o^*)\) is a core for \(H_r^{-\frac{1}{2}}\).

Note that Theorem 7.2 is not strong enough to justify the conclusion that \(S_a\) and \({\underline{S}}_r\) are quadratic storage functions, since the identities in (7.4) only hold on \({\text {ran}}({\mathbf {W}}_c)\) which might be strictly contained in the domains of \(H_a^{\frac{1}{2}}\) and \(H_r^{\frac{1}{2}}\), respectively. Later on, in Theorem 7.4 below, we will show that \(H_a\) and \(H_r\) are spatial solutions to the KYP inequality of \(\Sigma \) under the assumptions of Theorem 7.2, so that \(H_a\) and \(H_r\) induce quadratic storage functions by Theorem 1.9. These may differ from \(S_a\) and \({\underline{S}}_r\) outside \({\text {ran}}({{\mathbf {W}}}_c)\). However, if the initial state of a trajectory \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma \) on \({{{\mathbb {R}}}^{+}}\) satisfies \({{\mathbf {x}}}(0)\in {\text {ran}}({{\mathbf {W}}}_c)\), then \({{\mathbf {x}}}(t)\in {\text {ran}}({{\mathbf {W}}}_c)\) for all \(t\geqslant 0\), by items (3) and (4) of Proposition 3.2. For such state trajectories, \(S_a\) and \({\underline{S}}_r\) coincide with \(S_{H_a}\) and \(S_{H_r}\), respectively.

It is of interest to work out the corresponding results for the causal dual system \(\Sigma ^d\) explicitly in terms of objects related to the original system \(\Sigma \). Using (3.10) and (6.9), one gets that the Laurent operator \(L_{\Sigma ^d}\) for \(\Sigma ^d\) decomposes as in (6.9), with blocks obtained from the adjoints of the blocks of \(L_\Sigma \) by conjugation with the time-reflections appearing in (3.10); in particular, \(\widetilde{{\mathfrak {T}}}_{\Sigma ^d}\) and \({\mathfrak {T}}_{\Sigma ^d}\) are unitarily equivalent to \({\mathfrak {T}}_\Sigma ^*\) and \(\widetilde{{\mathfrak {T}}}_\Sigma ^*\), respectively. Furthermore, from (3.5) we see that the dual \(L^2\)-output operator \({\mathbf {W}}_o^d\) and the dual \(L^2\)-input operator \({\mathbf {W}}_c^d\) coincide, up to the same time-reflections, with \({\mathbf {W}}_c^\bigstar \) and \({\mathbf {W}}_o^*\), respectively; in particular,

$$\begin{aligned} {\text {ran}}({\mathbf {W}}_c^d)={\text {ran}}({\mathbf {W}}_o^*). \end{aligned}$$
(7.5)

Apply Lemma 7.1 with \(\Sigma ^d\) in place of \(\Sigma \) to see that the operator \({{\mathbf {X}}}_a^d\) obtained from item (1) is determined by the factorization

$$\begin{aligned} {\mathbf {W}}_o^d\big |_{{\text {ran}}({\mathbf {W}}_c^d)} = D_{{\mathfrak {T}}_{\Sigma ^d}^*} {{\mathbf {X}}}_a^d, \end{aligned}$$
(7.6)

where \(D_{{\mathfrak {T}}_{\Sigma ^d}^*}\) coincides with \(D_{\widetilde{{\mathfrak {T}}}_\Sigma }\) up to the time-reflection, as can be verified by simply squaring.

On the other hand, by (7.5) and Lemma 7.1 applied to \(\Sigma \), we have the factorization \({\mathbf {W}}_c^*|_{{\text {ran}}({\mathbf {W}}_o^*)} = D_{\widetilde{{\mathfrak {T}}}_{\Sigma }} {{\mathbf {X}}}_r\) from (7.3).

By combining these last two expressions we get that, after removal of the time-reflections, \({{\mathbf {X}}}_a^d\) and \({{\mathbf {X}}}_r\) differ by an operator mapping into \({\text {ker}}(D_{\widetilde{{\mathfrak {T}}}_\Sigma })\), and since \({\text {ran}}({{\mathbf {X}}}_r)\) is perpendicular to this kernel, we may conclude that

$$\begin{aligned} |\overline{{{\mathbf {X}}}}_a^d|=|\overline{{{\mathbf {X}}}}_r|, \end{aligned}$$

once we use (7.6) to observe that \({\text {ran}}({{\mathbf {X}}}_a^d)\) is perpendicular to \({\text {ker}}(D_{\widetilde{{\mathfrak {T}}}_\Sigma })\) as well.

By duality, we immediately get \(|\overline{{{\mathbf {X}}}}_r^d|=|\overline{{{\mathbf {X}}}}_a|\), and then the operators \(H_a^d\) and \(H_r^d\) associated with the dual system \(\Sigma ^d\), as in the paragraph preceding Theorem 7.2, are related to \(H_a\) and \(H_r\) via

$$\begin{aligned} H_a^d=H_r^{-1} \quad \text{ and }\quad H_r^d=H_a^{-1}. \end{aligned}$$
(7.7)

Therefore, Theorem 7.2 applied to \(\Sigma ^d\) leads us to the following formulas for the available storage and the \(L^2\)-regularized required supply of the causal dual system.

Theorem 7.3

Let \(\Sigma \) be a minimal well-posed system which has transfer function in \({{\mathcal {S}}}_{U,Y}\). Define \(\mathbf{X}_a\), \({\overline{\mathbf{X}}}_a\), \(\mathbf{X}_r\), \({\overline{\mathbf{X}}}_r\) as in Lemma 7.1 and \(H_a\) and \(H_r\) as in Theorem 7.2. Then \({\text {ran}}( {{\mathbf {W}}}_o^* )\) is contained in the domains of \(H_a^{-\frac{1}{2}}\) and \(H_r^{-\frac{1}{2}}\), and the available storage \(S^d_a\) and the \(L^2\)-regularized required supply \({\underline{S}}_r^d\) for the causal dual system \(\Sigma ^d\) are given by

$$\begin{aligned} S_a^d(x_0)&= \Vert |\overline{{{\mathbf {X}}}}_a^d| x_0 \Vert ^2 = \Vert |\overline{{{\mathbf {X}}}}_r| x_0 \Vert ^2 = \Vert H_r^{-\frac{1}{2}} x_0 \Vert ^2 \text { for } x_0 \in {\text {ran}}( {{\mathbf {W}}}_o^* ), \\ {\underline{S}}_r^d(x_0)&= \Vert |\overline{{{\mathbf {X}}}}_r^d|^{-1} x_0 \Vert ^2 = \Vert |\overline{{{\mathbf {X}}}}_a|^{-1} x_0 \Vert ^2 = \Vert H_a^{-\frac{1}{2}} x_0 \Vert ^2 \text { for } x_0 \in {\text {ran}}( {{\mathbf {W}}}_o^* ). \end{aligned}$$

Using the above results, we will next show that the solutions \(H_a\) and \(H_r\) to the spatial KYP-inequality (1.13) associated with \(\Sigma \) are minimal and maximal spatial solutions respectively for certain subclasses of spatial solutions.

Theorem 7.4

Let \(\Sigma \) be a minimal well-posed system which has transfer function in \({{\mathcal {S}}}_{U,Y}\). Then the operators \(H_a\) and \(H_r\) defined above are spatial solutions to the KYP-inequality (1.13). Moreover, for all spatial solutions H to (1.13) the following hold:

  1. (1)

    If \({\text {ran}}({\mathbf {W}}_c)\) is a core for \(H^\frac{1}{2}\), then \(H_a \preceq H\);

  2. (2)

    If \({\text {ran}}({\mathbf {W}}_o^*)\) is a core for \(H^{-\frac{1}{2}}\), then \(H \preceq H_r\).

Proof

We first prove the claims regarding \(H_a\). By items (3) and (4) of Proposition 3.2 it follows that

$$\begin{aligned} {\text {ran}}({\mathfrak {B}})\subset {\text {ran}}({\mathbf {W}}_c) \quad \text{ and } \quad {\mathfrak {A}}^t\, {\text {ran}}({\mathbf {W}}_c) \subset {\text {ran}}({\mathbf {W}}_c),\ \ t\in {{\mathbb {R}}}^+. \end{aligned}$$

In particular, Theorem 7.2 yields \({\text {ran}}({\mathfrak {B}})\subset {\text {dom}}(H_a^\frac{1}{2})\), implying \({\mathfrak {B}}^t L^2([0,t];U)\subset {\text {dom}}(H_a^{\frac{1}{2}})\). Moreover, the fact that \(S_a(x)=S_{H_a}(x)=\Vert H_a^{\frac{1}{2}}x\Vert ^2\) for all \(x\in {\text {ran}}({\mathbf {W}}_c)\), by Theorem 7.2, combined with the energy inequality (6.1) for the storage function \(S_a\), yields

$$\begin{aligned} \left\| \begin{bmatrix}H_a^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t&{}{\mathfrak {B}}^t\\ {\mathfrak {C}}^t&{}{\mathfrak {D}}^t\end{bmatrix}\begin{bmatrix}x\\ {{\mathbf {u}}}\end{bmatrix}\right\| \leqslant \left\| \begin{bmatrix}H_a^{\frac{1}{2}}&{}0\\ 0&{}I\end{bmatrix}\begin{bmatrix}x\\ {{\mathbf {u}}}\end{bmatrix}\right\| ,\ \begin{bmatrix}x\\ {{\mathbf {u}}}\end{bmatrix}\in \begin{bmatrix}{\text {ran}}({\mathbf {W}}_c)\\ L^2([0,t];U)\end{bmatrix}. \end{aligned}$$
(7.8)

Squaring on both sides and restricting to \({{\mathbf {u}}}=0\), we get

$$\begin{aligned} \Vert H_a^{\frac{1}{2}} {\mathfrak {A}}^t x\Vert ^2 \leqslant \Vert H_a^{\frac{1}{2}} {\mathfrak {A}}^t x\Vert ^2 + \Vert {\mathfrak {C}}^t x\Vert ^2 \leqslant \Vert H_a^{\frac{1}{2}} x\Vert ^2, \quad x\in {\text {ran}}({\mathbf {W}}_c), \end{aligned}$$

hence

$$\begin{aligned} \Vert H_a^{\frac{1}{2}} {\mathfrak {A}}^t x\Vert \leqslant \Vert H_a^{\frac{1}{2}} x\Vert , \quad x\in {\text {ran}}({\mathbf {W}}_c). \end{aligned}$$
(7.9)

Now take \({\widetilde{x}}\in {\text {dom}}(H_a^{\frac{1}{2}})\) and fix \(t\geqslant 0\). Since \({\text {ran}}({\mathbf {W}}_c)\) is a core for \(H_a^{\frac{1}{2}}\) by Theorem 7.2, there exists a sequence \(x_n\in {\text {ran}}({\mathbf {W}}_c)\), \(n\in {{\mathbb {Z}}}_+\), such that \(x_n \rightarrow {\widetilde{x}}\) and \(H_a^{\frac{1}{2}}x_n \rightarrow H_a^{\frac{1}{2}}{\widetilde{x}}\) in X. In particular, \(H_a^{\frac{1}{2}}x_n\) is a Cauchy sequence. Applying (7.9) with \(x=x_n-x_m\), we obtain that

$$\begin{aligned} \Vert H_a^{\frac{1}{2}} {\mathfrak {A}}^t x_n - H_a^{\frac{1}{2}} {\mathfrak {A}}^t x_m\Vert \leqslant \Vert H_a^{\frac{1}{2}} x_n - H_a^{\frac{1}{2}} x_m\Vert \rightarrow 0\quad \text{ as } n,m\rightarrow \infty . \end{aligned}$$

Hence \(H_a^{\frac{1}{2}} {\mathfrak {A}}^t x_n\) is also a Cauchy sequence, thus convergent in X. Also, \({\mathfrak {A}}^t x_n\) converges to \({\mathfrak {A}}^t {\widetilde{x}}\), because \({\mathfrak {A}}^t\) is bounded. Since \(H_a^{\frac{1}{2}}\) is closed, it follows that \({\mathfrak {A}}^t {\widetilde{x}}\) is in \({\text {dom}}(H_a^{\frac{1}{2}})\) and \(H_a^{\frac{1}{2}}{\mathfrak {A}}^t {\widetilde{x}}=\lim _{n\rightarrow \infty } H_a^{\frac{1}{2}}{\mathfrak {A}}^t x_n\). In particular, we proved that \({\mathfrak {A}}^t\, {\text {dom}}(H_a^{\frac{1}{2}}) \subset {\text {dom}}(H_a^{\frac{1}{2}})\), so that (1.12) holds. The fact that the spatial KYP inequality (1.13) holds on \({\text {dom}}(H_a^{\frac{1}{2}}) \oplus L^2([0,t];U)\) now also follows easily from (7.8), since for \({\widetilde{x}}\in {\text {dom}}(H_a^{\frac{1}{2}})\) and \(x_n\in {\text {ran}}({\mathbf {W}}_c)\) as above we have \(H_a^{\frac{1}{2}} x_n \rightarrow H_a^{\frac{1}{2}} {\widetilde{x}}\), \(H_a^{\frac{1}{2}} {\mathfrak {A}}^t x_n \rightarrow H_a^{\frac{1}{2}} {\mathfrak {A}}^t {\widetilde{x}}\) and \({\mathfrak {C}}^tx_n\rightarrow {\mathfrak {C}}^t{{\widetilde{x}}}\).

Assume next that H is any solution to the spatial KYP-inequality (1.13) with the property that \({\text {ran}}({\mathbf {W}}_c)\) is a core for \(H^\frac{1}{2}\). By Proposition 6.2 and Theorem 6.5, we have

$$\begin{aligned} \Vert H_a^{\frac{1}{2}}x\Vert ^2=S_a(x)\leqslant S_H(x)=\Vert H^{\frac{1}{2}}x\Vert ^2,\quad x\in {\text {ran}}({\mathbf {W}}_c). \end{aligned}$$
(7.10)

Take \({\widetilde{x}}\in {\text {dom}}(H^{\frac{1}{2}})\) arbitrarily, and let \(x_n\in {\text {ran}}({\mathbf {W}}_c)\), \(n\in {{\mathbb {Z}}}_+\), so that \(x_n\rightarrow {\widetilde{x}}\) and \(H^{\frac{1}{2}}x_n \rightarrow H^{\frac{1}{2}}{\widetilde{x}}\); such a sequence exists since \({\text {ran}}({\mathbf {W}}_c)\) is a core for \(H^{\frac{1}{2}}\), by assumption. Reasoning as above, the sequence \(H^{\frac{1}{2}}x_n\), \(n\in {{\mathbb {Z}}}_+\), is a Cauchy sequence, and the inequality (7.10) implies that \(H_a^{\frac{1}{2}}x_n\), \(n\in {{\mathbb {Z}}}_+\), is a Cauchy sequence as well. The closedness of \(H_a^{\frac{1}{2}}\) then implies that \({\widetilde{x}}\in {\text {dom}}(H_a^{\frac{1}{2}})\) and \(H_a^{\frac{1}{2}}x_n \rightarrow H_a^{\frac{1}{2}}{\widetilde{x}}\). Consequently, \({\text {dom}}(H^{\frac{1}{2}})\subset {\text {dom}}(H_a^{\frac{1}{2}})\) and the inequality (7.10) extends to all \(x \in {\text {dom}}(H^{\frac{1}{2}})\), which proves that \(H_a\preceq H\), and the proof of statement (1) is complete.

The proof of statement (2) requires drawing on results for the causal dual system \(\Sigma ^d\) as well as results for \(\Sigma \) itself. We note from (7.5) that \({\text {ran}}( {{\mathbf {W}}}_o^*) = {\text {ran}}( {{\mathbf {W}}}_c^d)\). Note also by Proposition 6.3 that H is a solution of the spatial KYP-inequality (1.13) for \(\Sigma \) if and only if \(H^{-1}\) is a solution of the spatial KYP-inequality (6.5) for \(\Sigma ^d\). Thus \({\text {ran}}({{\mathbf {W}}}_o^*)\) being a core for \(H^{-\frac{1}{2}}\) where H solves the KYP-inequality (1.13) for \(\Sigma \) is the same as \({\text {ran}}({{\mathbf {W}}}_c^d)\) being a core for \((H^{-1})^{\frac{1}{2}}\) where \(H^{-1}\) solves the KYP-inequality (6.5) for \(\Sigma ^d\). We conclude that the hypothesis for statement (2) in the theorem is the same as the hypothesis for statement (1), but applied to \(\Sigma ^d\) rather than to \(\Sigma \). Hence, if we assume the hypothesis for statement (2), we can use the implication in statement (1) already proved to conclude that \(H^d_a \preceq H^{-1}\), where (7.7) gives \(H_a^d = H_r^{-1}\), and thus we have \(H_r^{-1} \preceq H^{-1}\). Now [1, Proposition 3.4] gives us the desired inequality \(H \preceq H_r\). \(\square \)

Remark 7.5

Theorem 7.4 states that \(H_a\) and \(H_r\) are both positive definite spatial solutions to the KYP inequality (1.13), provided \(\Sigma \) is a minimal well-posed system which has transfer function in \({{\mathcal {S}}}_{U,Y}\), and they are the minimal and maximal spatial solutions at least within a certain subset of the collection of spatial solutions. To be precise, if \({{\mathcal {G}}}{{\mathcal {K}}}_\Sigma \) denotes the collection of all positive definite spatial solutions to (1.13), then \(H_a\) is the minimal element in

$$\begin{aligned} \widetilde{{{\mathcal {G}}}{{\mathcal {K}}}}_{\Sigma ,\text{ core }}:=\{H\in {{\mathcal {G}}}{{\mathcal {K}}}_\Sigma \mid {\text {ran}}({\mathbf {W}}_c) \text{ is } \text{ a } \text{ core } \text{ for } H^\frac{1}{2} \} \end{aligned}$$

while \(H_r\) is the maximal element in

$$\begin{aligned} \widehat{{{\mathcal {G}}}{{\mathcal {K}}}}_{\Sigma ,\text{ core }}:=\{H\in {{\mathcal {G}}}{{\mathcal {K}}}_\Sigma \mid {\text {ran}}({\mathbf {W}}_o^*) \text{ is } \text{ a } \text{ core } \text{ for } H^{-\frac{1}{2}} \}. \end{aligned}$$

The reason we cannot claim that \(H_a\) is the minimal element in \({{\mathcal {G}}}{{\mathcal {K}}}_\Sigma \), despite the fact that \(S_a\) is the minimal storage function for \(\Sigma \), is that in general we have only proved that \(S_a\) and \(S_{H_a}\) coincide on \({\text {ran}}({\mathbf {W}}_c)\).

In [5] another analysis of the spatial solutions to the KYP inequality for well-posed linear systems is carried out, with somewhat different extremality results. This may result from the fact that the analysis in [5] is conducted at the level of system nodes, and that the requirements there are slightly different. More precisely, in [5] it is not assumed that the well-posed system \(\Sigma \) is minimal; rather, for spatial solutions H it is assumed, in addition, that the well-posed system \(\Sigma _H\) obtained by applying \(H^\frac{1}{2}\) as a pseudo-similarity is minimal, and in that case the minimal and maximal solutions are those that correspond to the so-called optimal and \(*\)-optimal solutions. Note that because of the applied pseudo-similarity, the KYP-inequality for \(\Sigma _H\) always has a bounded and boundedly invertible solution, namely \(1_X\). Why the core conditions appearing in the present paper have no counterpart in [5] is unclear to us at this stage.

If, in addition to minimality and a Schur class transfer function, we also have \(L^2\)-controllability or \(L^2\)-observability, then more can be said about the operators \(H_a\) and \(H_r\).

Corollary 7.6

Let \(\Sigma \) be a minimal well-posed system which has transfer function in \({{\mathcal {S}}}_{U,Y}\). Then the following hold:

  1. (1)

    If \(\Sigma \) is \(L^2\)-controllable, then \(H_a\) and \(H_r\) are bounded.

  2. (2)

    If \(\Sigma \) is \(L^2\)-observable, then \(H_a^{-1}\) and \(H_r^{-1}\) are bounded.

Proof

Assume that \(\Sigma \) is \(L^2\)-controllable, that is, \({\text {dom}}({{\mathbf {W}}}_c^\bigstar )\) is dense and \({\text {ran}}({{\mathbf {W}}}_c)=X\). Since \(X={\text {ran}}({{\mathbf {W}}}_c)\) is contained in the domains of \(H_a^\frac{1}{2}\) and \(H_r^\frac{1}{2}\) by Theorem 7.2, it follows that \(H_a^\frac{1}{2}\) and \(H_r^\frac{1}{2}\) are bounded by the closed graph theorem; hence \(H_a\) and \(H_r\) are also bounded. Statement (2) follows by applying statement (1) to \(\Sigma ^d\). \(\square \)

8 Proofs of the Bounded Real Lemmas

In this section we prove the bounded real lemmas stated in the introduction. We start with a proof of Theorem 1.9.

Proof of Theorem 1.9

The implication (5) \(\Rightarrow \) (4) is trivial and many of the other implications have been proved in the preceding sections: that (4) \(\Rightarrow \) (1) follows from Proposition 6.1; the equivalence (2) \(\Leftrightarrow \) (5) follows from Proposition 6.2, together with the statement that the same H works in both items; Theorem 7.4 shows that (1) \(\Rightarrow \) (2). Hence it follows that (1) \(\Leftrightarrow \) (2) \(\Leftrightarrow \) (4) \(\Leftrightarrow \) (5).

Next, we show that (3) \(\Rightarrow \) (5). Assume that item (3) holds, say that \(\Gamma :X\supset {\text {dom}}(\Gamma )\rightarrow X^\circ \) implements a pseudo-similarity from \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) to a passive well-posed system \(\Sigma ^\circ =\left[ {\begin{matrix}{\mathfrak {A}}^\circ &{}{\mathfrak {B}}^\circ \\ {\mathfrak {C}}^\circ &{}{\mathfrak {D}}^\circ \end{matrix}}\right] \) with state space \(X^\circ \). In that case \(H:=\Gamma ^*\Gamma \) is a well-defined positive definite operator, as is its square root \(H^{\frac{1}{2}}\), and \({\text {dom}}(H^{\frac{1}{2}})={\text {dom}}(\Gamma )\) by [27, §VIII.9]. We next prove that \(S_H\) in (6.2) is a quadratic storage function for \(\Sigma \). For this, pick \(z_0\in {\text {dom}}(H^{\frac{1}{2}})\) arbitrarily and let \(({{\mathbf {u}}},{{\mathbf {z}}},{{\mathbf {y}}})\) be a trajectory of \(\Sigma \) on \({{{\mathbb {R}}}^{+}}\) with initial state \({{\mathbf {z}}}(0)=z_0\). Setting \({{\mathbf {x}}}(t):=\Gamma {{\mathbf {z}}}(t)\), \(t\geqslant 0\), and \(x_0:=\Gamma z_0\), we get that \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is a trajectory of \(\Sigma ^\circ \) on \({{{\mathbb {R}}}^{+}}\) with initial state \(x_0\), since

$$\begin{aligned} {{\mathbf {x}}}(t)= \Gamma {{\mathbf {z}}}(t)=\Gamma ({\mathfrak {A}}^tz_0+{\mathfrak {B}}^t{{\mathbf {u}}})= {\mathfrak {A}}^{\circ t}\Gamma {{\mathbf {z}}}(0)+{\mathfrak {B}}^{\circ t}{{\mathbf {u}}}, \end{aligned}$$

and

$$\begin{aligned} {\mathfrak {C}}^\circ x_0+{\mathfrak {D}}^\circ {{\mathbf {u}}}={\mathfrak {C}}^\circ \Gamma {{\mathbf {z}}}(0)+{\mathfrak {D}}{{\mathbf {u}}}={\mathfrak {C}}{{\mathbf {z}}}(0)+{\mathfrak {D}}{{\mathbf {u}}}={{\mathbf {y}}}. \end{aligned}$$

By passivity, every trajectory \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma ^\circ \) on \({{{\mathbb {R}}}^{+}}\) satisfies (6.1) with \(S(x_0)=\Vert x_0\Vert ^2_{X^\circ }\). Since \(S_H({{\mathbf {z}}}(t))=\langle \Gamma ^*\Gamma {{\mathbf {z}}}(t),{{\mathbf {z}}}(t)\rangle _X=\Vert \Gamma {{\mathbf {z}}}(t)\Vert ^2_{X^\circ }=\Vert {{\mathbf {x}}}(t)\Vert ^2_{X^\circ }\) whenever \({{\mathbf {z}}}(t)\in {\text {dom}}(H^{\frac{1}{2}})\), it follows that (6.1) also holds with S replaced by \(S_H\) and \({{\mathbf {x}}}\) replaced by \({{\mathbf {z}}}\). If \(z_0\not \in {\text {dom}}(H^{\frac{1}{2}})\) then \(S_H(z_0)=\infty \), and the modification of (6.1) is still true. We have proved that \(S_H\) is a quadratic storage function for \(\Sigma \), where \(H=\Gamma ^*\Gamma \).

Finally, we prove that (1) \(\Rightarrow \) (3). Assume the transfer function \(\widehat{{\mathfrak {D}}}\) of \(\Sigma \) is in \({{\mathcal {S}}}_{U,Y}\), more precisely, that it has an analytic continuation to a function in \({{\mathcal {S}}}_{U,Y}\). In that case \(\widehat{{\mathfrak {D}}}\) coincides with the transfer function of some minimal passive well-posed system on some right half-plane, by Theorem 11.8.14 in [31]. Hence we have two minimal well-posed systems whose transfer functions coincide on some right half-plane, of which one is passive. Then Theorem 9.2.4 in [31] (see also [5, Theorem 4.11]) implies that the two systems are pseudo-similar. In particular, \(\Sigma \) is pseudo-similar to a passive well-posed system. \(\square \)

Next we turn to the proof of Theorem 1.10.

Proof of Theorem 1.10

By Corollary 3.8, the \(L^2\)-minimality of \(\Sigma \) implies that \(\Sigma \) is minimal. Assume item (3) holds, i.e., \(\Sigma \) is similar to a passive system. Then, in particular, \(\Sigma \) is pseudo-similar to a passive system, and since \(\Sigma \) is minimal we can conclude from the implication (3) \(\Rightarrow \) (1) in Theorem 1.9 that the transfer function \({\widehat{{\mathfrak {D}}}}\) is in \({{\mathcal {S}}}_{U,Y}\). Hence item (1) holds.

Next we show that (2) \(\Rightarrow \) (3). Assume that the operator H on X is a bounded, strictly positive definite solution to the KYP inequality (1.14). In that case \(\Gamma :=H^{\frac{1}{2}}\) can serve as a similarity to a passive system. Indeed, for each \(t\geqslant 0\), set

$$\begin{aligned} \begin{bmatrix}{\mathfrak {A}}^{\circ t}&{}{\mathfrak {B}}^{\circ t} \\ {\mathfrak {C}}^{\circ t}&{} {\mathfrak {D}}^{\circ t}\end{bmatrix}:= \begin{bmatrix}H^\frac{1}{2} &{}0\\ 0&{}I\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t\end{bmatrix} \begin{bmatrix}H^{-\frac{1}{2}} &{}0\\ 0&{}I\end{bmatrix}. \end{aligned}$$

Then we have

$$\begin{aligned} H^{\frac{1}{2}}{\mathfrak {A}}^t={\mathfrak {A}}^{\circ t} H^{\frac{1}{2}},\quad H^{\frac{1}{2}}{\mathfrak {B}}^t={\mathfrak {B}}^{\circ t},\quad {\mathfrak {C}}^t={\mathfrak {C}}^{\circ t} H^{\frac{1}{2}},\quad {\mathfrak {D}}^t={\mathfrak {D}}^{\circ t}. \end{aligned}$$
(8.1)

Furthermore, (1.14) implies that \(\left[ {\begin{matrix}{\mathfrak {A}}^{\circ t}&{}{\mathfrak {B}}^{\circ t} \\ {\mathfrak {C}}^{\circ t}&{} {\mathfrak {D}}^{\circ t} \end{matrix}}\right] \) is contractive for each \(t\geqslant 0\). Clearly, the relation between \({\mathfrak {A}}^t\) and \({\mathfrak {A}}^{\circ t}\) in (8.1) with \(H^{\frac{1}{2}}\) bounded and boundedly invertible implies that \({\mathfrak {A}}^{\circ t}\) inherits the properties of a \(C_0\)-semigroup from \({\mathfrak {A}}^t\). Next, define \({\mathfrak {B}}^\circ \), \({\mathfrak {C}}^\circ \) and \({\mathfrak {D}}^\circ \) via the limits in (2.2), adding \(\circ \) where appropriate. It is then easy to check that (8.1) extends to

$$\begin{aligned} H^{\frac{1}{2}}{\mathfrak {A}}^t={\mathfrak {A}}^{\circ t} H^{\frac{1}{2}},\quad H^{\frac{1}{2}}{\mathfrak {B}}={\mathfrak {B}}^{\circ },\quad {\mathfrak {C}}={\mathfrak {C}}^{\circ } H^{\frac{1}{2}},\quad {\mathfrak {D}}={\mathfrak {D}}^{\circ }, \end{aligned}$$

and via these relations it follows that the requirements on the \(C_0\)-semigroup \({\mathfrak {A}}^\circ \) and the operators \({\mathfrak {B}}^\circ \), \({\mathfrak {C}}^\circ \) and \({\mathfrak {D}}^\circ \) to form a well-posed system (Definition 2.1) carry over from \({\mathfrak {A}}\), \({\mathfrak {B}}\), \({\mathfrak {C}}\) and \({\mathfrak {D}}\). We have proved that \(\left[ {\begin{matrix}{\mathfrak {A}}^{\circ }&{}{\mathfrak {B}}^{\circ } \\ {\mathfrak {C}}^{\circ }&{} {\mathfrak {D}}^{\circ } \end{matrix}}\right] \) is a passive system that is similar to \(\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) via the similarity \(\Gamma =H^{\frac{1}{2}}\); hence item (3) holds.
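The algebra in this step can be illustrated by a small finite-dimensional numerical sketch (a matrix analogue for a fixed \(t\); the matrices, dimensions and random data below are illustrative assumptions, not objects from the paper): starting from a contractive block and an invertible P playing the role of \(H^{\frac{1}{2}}\), one checks that \(H=P^2\) solves the matrix KYP inequality, and that conjugating with \(\mathrm{diag}(P,I)\) as in (8.1) recovers a contractive, i.e. passive, block.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 3, 2, 2   # state, input and output dimensions (illustrative)

# Start from a passive (contractive) block [[Ao, Bo], [Co, Do]] ...
M = rng.standard_normal((n + k, n + m))
M *= 0.95 / np.linalg.norm(M, 2)               # ||M|| < 1
Ao, Bo, Co, Do = M[:n, :n], M[:n, n:], M[n:, :n], M[n:, n:]

# ... and pull it back through an invertible P, playing the role of H^{1/2}.
Q = rng.standard_normal((n, n))
P = Q @ Q.T + n * np.eye(n)                    # symmetric positive definite
Pinv = np.linalg.inv(P)
A, B, C, D = Pinv @ Ao @ P, Pinv @ Bo, Co @ P, Do
H = P @ P                                      # H = P^2, bounded and strictly positive

# H solves the matrix KYP inequality S^T diag(H, I_k) S <= diag(H, I_m):
S = np.block([[A, B], [C, D]])
W_in = np.block([[H, np.zeros((n, m))], [np.zeros((m, n)), np.eye(m)]])
W_out = np.block([[H, np.zeros((n, k))], [np.zeros((k, n)), np.eye(k)]])
assert np.linalg.eigvalsh(S.T @ W_out @ S - W_in).max() <= 1e-8

# ... and conjugating with diag(P, I) as in (8.1) recovers the contraction M.
left = np.block([[P, np.zeros((n, k))], [np.zeros((k, n)), np.eye(k)]])
right = np.block([[Pinv, np.zeros((n, m))], [np.zeros((m, n)), np.eye(m)]])
S_hat = left @ S @ right
assert np.allclose(S_hat, M) and np.linalg.norm(S_hat, 2) <= 1.0
```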

To establish the mutual equivalence of all three items, it remains to prove that (1) \(\Rightarrow \) (2). Hence assume that \({\widehat{{\mathfrak {D}}}}\in {{\mathcal {S}}}_{U,Y}\). Since \(\Sigma \) is minimal, Theorem 7.4 gives that \(H_a\) and \(H_r\) are spatial solutions to the KYP-inequality (1.13). Moreover, the \(L^2\)-minimality of \(\Sigma \) implies that \(H_a\) and \(H_r\) are bounded and boundedly invertible, by Corollary 7.6; that is, \(H_a\) and \(H_r\) are bounded and strictly positive definite. Since \(H_a\) and \(H_r\) are bounded solutions to the spatial KYP inequality (1.13), it is immediate that they also satisfy the standard KYP inequality (1.14). Hence statement (2) holds.

Next we prove that \({{{\mathbb {C}}}^{+}}\subset {\text {dom}}({\widehat{{\mathfrak {D}}}})\) if there is some bounded and boundedly invertible \(\Gamma \) that implements the similarity from \(\Sigma \) to a passive system \(\Sigma ^\circ \). Assume this and recall that by Proposition 2.3, \( {\text {dom}}({\widehat{{\mathfrak {D}}}})={{\mathbb {C}}}_{\omega _{{\mathfrak {A}}}}\). Since \({\mathfrak {A}}^\circ \) is a contraction semigroup, as implied by passivity, we get from (2.5) that

$$\begin{aligned} \omega _{{\mathfrak {A}}}=\lim _{t\rightarrow \infty }\frac{\ln \Vert {\mathfrak {A}}^{t}\Vert }{t} \leqslant \lim _{t\rightarrow \infty }\frac{\ln \Vert \Gamma ^{-1}\Vert +\ln \Vert {\mathfrak {A}}^{\circ t}\Vert +\ln \Vert \Gamma \Vert }{t}=\omega _{{\mathfrak {A}}^\circ }\leqslant 0. \end{aligned}$$

We established above that every bounded, strictly positive definite solution H to the KYP inequality provides a similarity via \(H^{\frac{1}{2}}\). The converse implication follows from the final statement in Theorem 1.9.

We already noted that \(H_a\) and \(H_r\) are both bounded and strictly positive definite, and that \(\Sigma \) is approximately controllable, so that \({\text {ran}}({\mathfrak {B}})\) is dense in X. By Proposition 6.2, every solution H to the spatial KYP inequality (1.13) defines a storage function \(S_H\), which by Theorem 6.5 is wedged between \(S_a\) and \(S_r\): \(S_a(x)\leqslant S_H(x)\leqslant S_r(x)\) for all \(x\in X\). Moreover, combining item (3) in Proposition 3.2 with (6.11) and (6.13), we get that \(S_r(x)={\underline{S}}_r(x)\) for all \(x\in {\text {ran}}({\mathfrak {B}})\subset {\text {ran}}({{\mathbf {W}}}_c)\). Then (7.4) gives

$$\begin{aligned} \Vert H_a^{\frac{1}{2}} x\Vert \leqslant \Vert H^{\frac{1}{2}} x\Vert \leqslant \Vert H_r^{\frac{1}{2}} x\Vert ,\quad x\in {\text {ran}}({\mathfrak {B}}). \end{aligned}$$

Since \({\text {ran}}({\mathfrak {B}})\) is dense in X, these inequalities in fact hold on all of X, and we get that H inherits boundedness from \(H_r\), while strict positive definiteness carries over to H from \(H_a\). Hence every spatial solution H to the KYP inequality (1.13) is in fact a bounded, strictly positive definite solution to the standard KYP inequality (1.14), and \(H_a \preceq H \preceq H_r\) holds. \(\square \)

In case the transfer function is a strict Schur class function and \({\mathfrak {A}}\) is exponentially stable, to obtain a bounded, strictly positive definite solution H to the standard KYP inequality (1.14), it suffices to have only \(L^2\)-controllability or \(L^2\)-observability:

Proposition 8.1

Let \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{}{\mathfrak {D}} \end{matrix}}\right] \) be a minimal, exponentially stable well-posed system with transfer function \({\widehat{{\mathfrak {D}}}}\) in the strict Schur class \({{\mathcal {S}}}_{U,Y}^0\). Then \(H_a\) and \(H_r^{-1}\) are bounded and are given by

$$\begin{aligned} H_a={\mathbf {W}}_o^* D_{{\mathfrak {T}}_\Sigma ^*}^{-2} {\mathbf {W}}_o\quad \text{ and }\quad H_r^{-1}={\mathbf {W}}_c D_{{\widetilde{{\mathfrak {T}}}}_\Sigma }^{-2} {\mathbf {W}}_c^*. \end{aligned}$$
(8.2)

Furthermore, \(H_a^{-1}\) is bounded if and only if \(\Sigma \) is \(L^2\)-observable and \(H_r\) is bounded if and only if \(\Sigma \) is \(L^2\)-controllable.

Proof

By Lemma 3.6, the exponential stability guarantees that the operators \({\mathbf {W}}_o\) and \({\mathbf {W}}_c\) are bounded. Moreover, because \({\widehat{{\mathfrak {D}}}}\in {{\mathcal {S}}}_{U,Y}^0\), \(D_{{\mathfrak {T}}_\Sigma ^*}\) and \(D_{{\widetilde{{\mathfrak {T}}}}_\Sigma }\) are boundedly invertible. It follows that the operators \({{\mathbf {X}}}_a\) and \({{\mathbf {X}}}_r\) in Lemma 7.1 are given by

$$\begin{aligned} {{\mathbf {X}}}_a= D_{{\mathfrak {T}}_\Sigma ^*}^{-1} {\mathbf {W}}_o|_{{\text {ran}}({\mathbf {W}}_c)}\quad \text{ and }\quad {{\mathbf {X}}}_r= D_{{\widetilde{{\mathfrak {T}}}}_\Sigma }^{-1} {\mathbf {W}}_c^*|_{{\text {ran}}({\mathbf {W}}_o^*)}, \end{aligned}$$

and hence they extend uniquely to bounded operators \(\overline{{{\mathbf {X}}}}_a=D_{{\mathfrak {T}}_\Sigma ^*}^{-1} {\mathbf {W}}_o\) and \(\overline{{{\mathbf {X}}}}_r=D_{{\widetilde{{\mathfrak {T}}}}_\Sigma }^{-1} {\mathbf {W}}_c^*\) from X into \(L^{2+}_Y\) and \(L^{2-}_U\), respectively. The boundedness of \(H_a=\overline{{{\mathbf {X}}}}_a^*\overline{{{\mathbf {X}}}}_a\) and \(H_r^{-1}=\overline{{{\mathbf {X}}}}_r^*\overline{{{\mathbf {X}}}}_r\), together with the formulas (8.2), now follows directly. Moreover, given the boundedness of \({\mathbf {W}}_o\) and \({\mathbf {W}}_c\), \(L^2\)-observability and \(L^2\)-controllability are equivalent to \({\mathbf {W}}_o\) and \({\mathbf {W}}_c^*\) being bounded below, respectively, from which the last claim follows. \(\square \)

Using Proposition 8.1, we can explicitly compute the extremal KYP solutions \(H_a\) and \(H_r\) arising from the minimal realization of the strict Schur-class transfer function (5.7) that was already discussed in Example 5.5, thereby illustrating Proposition 8.1 and item (5) of Theorem 1.12.

Example 8.2

In Example 5.5, we considered the diagonal system \(\Sigma \) with operators

$$\begin{aligned} {\mathfrak {A}}^t\phi _n=e^{-(n+1)t}\phi _n,\quad B\phi _n=\frac{\sqrt{n+1}}{2}\phi _n,\quad C\phi _n=2\sqrt{n+1}\phi _n,\qquad n=0,1,\ldots , \end{aligned}$$

leading to \({{\mathbf {W}}}_o\) determined by

$$\begin{aligned} ({{\mathbf {W}}}_o\phi _n)(t)=2\sqrt{n+1}e^{-(n+1)t}\phi _n, \quad t\geqslant 0, \end{aligned}$$
(8.3)

being bounded from both below and above.

In order to apply the formula for \(H_a\) in (8.2), we additionally need some information on the action of the adjoint of \({\mathfrak {T}}_\Sigma \) in (6.9). Combining the latter with item (1) of Theorem 3.4, we get \({\mathfrak {T}}_\Sigma ={\mathfrak {D}}\big |_{L^{2+}_U}\), and we next compute this operator using (4.6). By (4.2) and item (3) of Proposition 3.2,

$$\begin{aligned} {{\mathbf {W}}}_c \big (f(\cdot )\phi _n\big )= \frac{1}{2}\sqrt{n+1}\int _{-\infty }^0e^{(n+1)s}f(s)\,{\mathrm {d}}s\,\phi _n, \quad f\in L^{2-}, \end{aligned}$$
(8.4)

and \({\mathfrak {B}}={{\mathbf {W}}}_c\big |_{L^{2-}_{\ell ,U}}\). Combining the above with (5.6) and \({\widehat{{\mathfrak {D}}}}(0)=\frac{1}{2}1_U\) gives for all \(f\in L^{2}_{\ell ,loc,{{\mathbb {C}}}}\) and \(n=0,1,\ldots \) that

$$\begin{aligned} {\mathfrak {D}}\big (f(\cdot )\phi _n\big ) = t\mapsto (n+1)\,e^{-(n+1)t}\int _{-\infty }^te^{(n+1)s}f(s)\,{\mathrm {d}}s\,\phi _n - \frac{1}{2}f(t)\phi _n, \quad t\in {{\mathbb {R}}}. \end{aligned}$$
(8.5)

For all \(n=0,1,\ldots \) and \(u=\sum _{m=0}^\infty f_m\phi _m\), \(f_m\in L^{2+}_{{\mathbb {C}}}\), it then holds that

$$\begin{aligned} \begin{aligned}&\left\langle {\mathfrak {T}}_\Sigma ^*\pi _+e_{-(n+1)}\phi _n , u \right\rangle _{L^{2+}_U} \\&\quad = \left\langle \pi _+e_{-(n+1)}\phi _n , {\mathfrak {T}}_\Sigma \!\sum _{m=0}^\infty f_m(\cdot ) \phi _m \right\rangle _{L^{2+}_U}\\&\quad =\int _0^\infty e^{-(n+1)t}\,\overline{\phi _n^*\big ({\mathfrak {D}}f_n(\cdot )\phi _n\big )(t)}\,{\mathrm {d}}t \\&\quad =\int _0^\infty e^{-(n+1)t}\left( \overline{(n+1)e^{-(n+1)t}\int _0^te^{(n+1)s}f_n(s)\,{\mathrm {d}}s -\frac{1}{2} f_n(t)}\right) \,{\mathrm {d}}t \\&\quad =\int _0^\infty (n+1)e^{(n+1)s}\overline{f_n(s)}\int _s^\infty e^{-2(n+1)t}\,{\mathrm {d}}t \,{\mathrm {d}}s - \frac{1}{2}\int _0^\infty e^{-(n+1)t} \overline{f_n(t)}\,{\mathrm {d}}t=0. \end{aligned} \end{aligned}$$

Hence, \({\mathfrak {T}}_\Sigma ^*\pi _+e_{-(n+1)}\phi _n=0\) for \(n=0,1,\ldots \), which implies that

$$\begin{aligned} D_{{\mathfrak {T}}_\Sigma ^*}^{2}(\pi _+e_{-(n+1)}\phi _n)=\pi _+e_{-(n+1)}\phi _n= D_{{\mathfrak {T}}_\Sigma ^*}^{-2}(\pi _+e_{-(n+1)}\phi _n). \end{aligned}$$

Using (8.2) and (8.3), we then easily calculate

$$\begin{aligned} H_a\phi _n={\mathbf {W}}_o^*{\mathbf {W}}_o\phi _n=4(n+1)\int _0^\infty e^{-2(n+1)t}\,{\mathrm {d}}t\,\phi _n=2\,\phi _n, \end{aligned}$$

i.e., that \(H_a=2\cdot 1_X\).

Now proceeding to \(H_r\), we get from (8.4) that

$$\begin{aligned} {{\mathbf {W}}}_c^*\phi _n=\frac{\sqrt{n+1}}{2}\pi _-e_{n+1}\phi _n, \end{aligned}$$

and we need to evaluate \(D_{{\widetilde{{\mathfrak {T}}}}_\Sigma }^{-2}\) on this. By item (1) of Theorem 3.4 and (8.5),

$$\begin{aligned} (L_\Sigma \pi _-e_{n+1}\phi _n)(t)= (n+1)e^{-(n+1)t}\int _{-\infty }^te^{2(n+1)s}\,{\mathrm {d}}s\,\phi _n -\frac{e^{(n+1)t}}{2}\phi _n=0,\quad t\leqslant 0, \end{aligned}$$

so that \(D^{-2}_{{\widetilde{{\mathfrak {T}}}}_\Sigma }\pi _-e_{n+1}\phi _n=\pi _-e_{n+1}\phi _n\). Then (8.2) and (8.4) give

$$\begin{aligned} H_r\phi _n = ({{\mathbf {W}}}_c{{\mathbf {W}}}_c^*)^{-1}\phi _n =8\,\phi _n. \end{aligned}$$

Finally, by Theorem 1.10, all solutions H to the spatial KYP inequality for \(\Sigma \) are bounded and strictly positive definite; in fact they satisfy \(2\cdot 1_X\preceq H\preceq 8\cdot 1_X\).
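All of the quantities in this example can be confirmed by direct numerical quadrature. The following sketch is our own illustration and not part of the formal argument; the mode-wise transfer function value \((n+1)/(\lambda +n+1)-\frac{1}{2}\) is read off from (8.5), whose convolution kernel \((n+1)e^{-(n+1)(t-s)}\) has Laplace transform \((n+1)/(\lambda +n+1)\) and whose feedthrough is \(-\frac{1}{2}\).

```python
import numpy as np
from scipy.integrate import quad

# Numerical confirmation of Example 8.2 (an illustration, truncated to the
# first few modes n). Per mode: H_a phi_n = 2 phi_n, H_r phi_n = 8 phi_n,
# and the mode-wise transfer function has constant modulus 1/2 on i*R.
for n in range(5):
    a = n + 1.0
    # H_a phi_n = 4(n+1) int_0^oo e^{-2(n+1)t} dt phi_n
    Ha, _ = quad(lambda t: 4.0 * a * np.exp(-2.0 * a * t), 0.0, np.inf)
    # W_c W_c^* phi_n = ((n+1)/4) int_{-oo}^0 e^{2(n+1)s} ds phi_n
    WcWcs, _ = quad(lambda s: (a / 4.0) * np.exp(2.0 * a * s), -np.inf, 0.0)
    Hr = 1.0 / WcWcs                      # H_r = (W_c W_c^*)^{-1} per mode
    w = np.linspace(-100.0, 100.0, 2001)  # grid on the imaginary axis
    dn = a / (1j * w + a) - 0.5           # mode-wise transfer function values
    print(n, round(Ha, 6), round(Hr, 6), round(np.abs(dn).max(), 6))
# Output per mode: 2.0, 8.0 and 0.5, matching H_a = 2*1_X, H_r = 8*1_X and
# the strict Schur bound for this example.
```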

We now turn to the proof of the strict bounded real lemma, stated as Theorem 1.12.

Proof of (2a) \(\Rightarrow \) (2b), (3a) \(\Rightarrow \) (3b), (4a) \(\Rightarrow \) (4b), (5a) \(\Rightarrow \) (5b). Note that these are tautologies following from the definitions. \(\square \)

Proof of (2a) \(\Leftrightarrow \) (3a) \(\Leftrightarrow \) (4a) \(\Rightarrow \) (5a). Let us assume (2a). Thus there is a bounded, strictly positive definite H on X satisfying (1.16) for some \(\delta > 0\). As we saw in the proof of (2) \(\Rightarrow \) (3) in Theorem 1.10, \(\Gamma := H^{\frac{1}{2}}\) is an invertible change of state-space coordinates \(x^\circ := \Gamma x\) transforming the well-posed linear system \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) to the system

$$\begin{aligned} \Sigma ^\circ = \begin{bmatrix} {\mathfrak {A}}^\circ &{} {\mathfrak {B}}^\circ \\ {\mathfrak {C}}^\circ &{} {\mathfrak {D}}^\circ \end{bmatrix} := \begin{bmatrix} \Gamma {\mathfrak {A}}\Gamma ^{-1} &{} \Gamma {\mathfrak {B}}\\ {\mathfrak {C}}\Gamma ^{-1} &{} {\mathfrak {D}}\end{bmatrix}, \end{aligned}$$

and moreover, for each \(t > 0\), the map

$$\begin{aligned} \Sigma ^{\circ t} = \begin{bmatrix} {\mathfrak {A}}^{\circ t} &{} {\mathfrak {B}}^{\circ t} \\ {\mathfrak {C}}^{ \circ t} &{} {\mathfrak {D}}^{\circ t} \end{bmatrix} :\begin{bmatrix} x^\circ (0) \\ u^\circ |_{[0,t]} \end{bmatrix} \mapsto \begin{bmatrix} x^\circ (t) \\ y^\circ |_{[0,t]} \end{bmatrix} \end{aligned}$$

has the same form when considered as a transformation of \(\Sigma ^t = \left[ {\begin{matrix} {\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t \end{matrix}}\right] \):

$$\begin{aligned} \Sigma ^{\circ t} = \begin{bmatrix} \Gamma {\mathfrak {A}}^t \Gamma ^{-1} &{} \Gamma {\mathfrak {B}}^t \\ {\mathfrak {C}}^t \Gamma ^{-1} &{} {\mathfrak {D}}^t \end{bmatrix}. \end{aligned}$$

Note that the inequality (1.16) can be interpreted as the statement that the system trajectories \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) of \(\Sigma \) satisfy

$$\begin{aligned} \begin{aligned}&\Vert \Gamma {{\mathbf {x}}}(t) \Vert ^2 + \Vert {{\mathbf {y}}}|_{[0,t]} \Vert ^2_{L^2([0,t];Y)} +\delta \Vert {{\mathbf {x}}}|_{[0,t]} \Vert ^2_{L^2([0,t];X)} \\&\quad \le \Vert \Gamma {{\mathbf {x}}}(0) \Vert ^2 + (1 - \delta ) \Vert {{\mathbf {u}}}|_{[0,t]} \Vert ^2_{L^2([0,t]; U)},\qquad t>0. \end{aligned} \end{aligned}$$
(8.6)

Using that \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is a system trajectory for \(\Sigma \) if and only if \(({{\mathbf {u}}}^\circ , {{\mathbf {x}}}^\circ , {{\mathbf {y}}}^\circ ) = ({{\mathbf {u}}},\Gamma {{\mathbf {x}}}, {{\mathbf {y}}})\) is a system trajectory for \(\Sigma ^\circ \) and the simple estimate \(\Vert \Gamma x\Vert \leqslant \Vert \Gamma \Vert \cdot \Vert x\Vert \), we get from (8.6) that

$$\begin{aligned} \begin{aligned}&\Vert {{\mathbf {x}}}^\circ (t) \Vert ^2 + \Vert {{\mathbf {y}}}|_{[0,t]} \Vert ^2_{L^2([0,t];Y)} +\delta ' \Vert {{\mathbf {x}}}^\circ |_{[0,t]} \Vert ^2_{L^2([0,t];X)} \\&\quad \le \Vert {{\mathbf {x}}}^\circ (0) \Vert ^2 + \left( 1 - \delta \right) \Vert {{\mathbf {u}}}|_{[0,t]} \Vert ^2_{L^2([0,t]; U)}\,,\qquad t>0, \end{aligned} \end{aligned}$$
(8.7)

where \(\delta ':=\min (\delta ,\delta /\Vert \Gamma \Vert ^2)>0\). Since \(1-\delta \leqslant 1-\delta '\), we may also replace \(\delta \) by \(\delta '\) on the right-hand side of (8.7); the result is precisely (1.16) for the system \(\Sigma ^\circ \) with \(H = 1_X\) and with \(\delta \) replaced by \(\delta '>0\), and (3a) is established.
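For a finite dimensional system, the coordinate change used in this argument is easy to make explicit. The following sketch is our own illustration, with an arbitrarily chosen stable quadruple (A, B, C, D) and an arbitrary \(H\succ 0\); it checks numerically that \(({{\mathbf {u}}},\Gamma {{\mathbf {x}}},{{\mathbf {y}}})\) is a trajectory of the transformed system whenever \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) is a trajectory of the original one, with \(\Gamma =H^{\frac{1}{2}}\).

```python
import numpy as np
from scipy.linalg import sqrtm
from scipy.signal import lsim

# Sketch: the similarity Gamma = H^{1/2} maps trajectories (u, x, y) of
# (A, B, C, D) to trajectories (u, Gamma x, y) of the transformed system.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])   # arbitrary stable example data
B = np.array([[1.0], [1.0]])
C = np.array([[0.5, 0.25]])
D = np.array([[0.1]])
M = np.array([[1.0, 0.3], [-0.2, 0.8]])
H = M @ M.T + np.eye(2)                    # an arbitrary H > 0
G = np.real(sqrtm(H))                      # Gamma = H^{1/2}
Gi = np.linalg.inv(G)

t = np.linspace(0.0, 10.0, 2001)
u = np.sin(3.0 * t)
x0 = np.array([1.0, -1.0])

_, y, x = lsim((A, B, C, D), u, t, X0=x0)
_, yo, xo = lsim((G @ A @ Gi, G @ B, C @ Gi, D), u, t, X0=G @ x0)

print(np.abs(y - yo).max())        # ~ 0: the outputs coincide
print(np.abs(x @ G.T - xo).max())  # ~ 0: the states are related by Gamma
```

The same computation with \(\Gamma ^{-1}\) in place of \(\Gamma \) illustrates the converse direction treated next.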

Conversely, assume (3a), so that \(\Sigma \) is similar to a strictly passive system \(\Sigma ^\circ \) via an invertible \(\Gamma :X \rightarrow X^\circ \), and let \(({{\mathbf {u}}},{{\mathbf {x}}},{{\mathbf {y}}})\) be a system trajectory of \(\Sigma \). Then \(({{\mathbf {u}}}^\circ , {{\mathbf {x}}}^\circ , {{\mathbf {y}}}^\circ ) = ({{\mathbf {u}}}, \Gamma {{\mathbf {x}}}, {{\mathbf {y}}})\) is a system trajectory of \(\Sigma ^\circ \) such that (8.7) holds for some \(\delta =\delta '>0\). Setting \(H = \Gamma ^* \Gamma \succ 0\) and observing that \(\Vert x\Vert /\Vert \Gamma ^{-1}\Vert \leqslant \Vert \Gamma x\Vert \), we obtain from (8.7), with \(\delta '':=\min (\delta ,\delta /\Vert \Gamma ^{-1}\Vert ^2)>0\), that

$$\begin{aligned}&\Vert H^{\frac{1}{2}} {{\mathbf {x}}}(t) \Vert ^2 + \Vert {{\mathbf {y}}}|_{[0,t]} \Vert ^2_{L^2([0,t];Y)} +\delta '' \Vert {{\mathbf {x}}}|_{[0,t]} \Vert ^2_{L^2([0,t];X)} \nonumber \\&\quad \le \Vert H^{\frac{1}{2}} {{\mathbf {x}}}(0) \Vert ^2 + \left( 1 - \delta ''\right) \Vert {{\mathbf {u}}}|_{[0,t]} \Vert ^2_{L^2([0,t];U)}. \end{aligned}$$

This in turn is equivalent to H being a bounded, strictly positive-definite solution to (1.16) with \(\delta \) replaced by \(\delta ''>0\). Hence (2a) \(\Leftrightarrow \) (3a).

Next note that (2a) \(\Leftrightarrow \) (4a) follows from the discussion in Remark 1.13. Finally (4a) \(\Rightarrow \) (5a) is a tautology. \(\square \)

Proof of (2b) \(\Leftrightarrow \) (3b) \(\Leftrightarrow \) (4b) \(\Rightarrow \) (5b). (2b) \(\Leftrightarrow \) (3b) is a simpler version of the above proof of (2a) \(\Leftrightarrow \) (3a), where one works with (1.17) in place of (1.16) and the manipulations of \(\delta \) associated to the now absent term \(\Vert {{\mathbf {x}}}|_{[0,t]} \Vert ^2_{L^2([0,t];X)}\) are not needed. The equivalence of (2b) and (4b) is again a consequence of the observations in Remark 1.13. Finally, (4b) \(\Rightarrow \) (5b) is a tautology. \(\square \)

Proof of (5b) \(\Rightarrow \) (1). Assume that \(\Sigma \) satisfies condition (5b), so that \(\Sigma \) has a semi-strict storage function S satisfying (1.9), repeated here (in the case \(t_1 = 0\), \(t_2 = t\)) for the reader’s convenience: There is a \(\delta > 0\) such that

$$\begin{aligned} S({{\mathbf {x}}}(t)) + \int _0^t \Vert {{\mathbf {y}}}(s) \Vert ^2 \,{\mathrm {d}}s \le S({{\mathbf {x}}}(0)) + (1 - \delta ) \int _0^t \Vert {{\mathbf {u}}}(s) \Vert ^2 \,{\mathrm {d}}s,\quad t\geqslant 0, \end{aligned}$$

for all trajectories \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) of \(\Sigma \) on \({{{\mathbb {R}}}^{+}}\). As S(x) (and hence \(S({{\mathbf {x}}}(t))\)) has values in \([0, \infty ]\), we certainly then also have

$$\begin{aligned} \int _0^t \Vert {{\mathbf {y}}}(s) \Vert ^2 \,{\mathrm {d}}s \le S({{\mathbf {x}}}(t)) + \int _0^t \Vert {{\mathbf {y}}}(s) \Vert ^2 \,{\mathrm {d}}s \le S({{\mathbf {x}}}(0)) + (1 - \delta ) \int _0^t \Vert {{\mathbf {u}}}(s) \Vert ^2 \,{\mathrm {d}}s \end{aligned}$$

for all such system trajectories \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) and \(t\geqslant 0\). In particular, let us consider only those system trajectories initialized to satisfy \({{\mathbf {x}}}(0) = 0\). Then, using that storage functions by definition satisfy \(S(0) = 0\) and dropping the middle expression in the preceding chain of inequalities, we see that

$$\begin{aligned} \int _0^t \Vert {{\mathbf {y}}}(s) \Vert ^2\,{\mathrm {d}}s \le (1 - \delta ) \int _0^t \Vert {{\mathbf {u}}}(s) \Vert ^2 \,{\mathrm {d}}s,\quad t>0. \end{aligned}$$

Letting t tend to \(+\infty \) then gives us

$$\begin{aligned} \Vert {{\mathbf {y}}}\Vert ^2_{L^2({{\mathbb {R}}}^+, Y)} \le (1 - \delta ) \Vert {{\mathbf {u}}}\Vert ^2_{L^2({{\mathbb {R}}}^+,U)}. \end{aligned}$$

Applying the Plancherel Theorem and taking Laplace transforms then gives us

$$\begin{aligned} \Vert {\widehat{{{\mathbf {y}}}}} \Vert ^2_{H^2({{\mathbb {C}}}^+, Y)} \le (1 - \delta ) \Vert {\widehat{{{\mathbf {u}}}}} \Vert ^2_{H^2({{\mathbb {C}}}^+, U)}, \end{aligned}$$

where, as noted in (2.9), \({\widehat{{{\mathbf {y}}}}} = M_{{\widehat{{\mathfrak {D}}}}} {\widehat{{{\mathbf {u}}}}}\); see also (3.7). Hence \(\Vert M_{{\widehat{{\mathfrak {D}}}}} \Vert \le \sqrt{1 - \delta }\) and therefore

$$\begin{aligned} \Vert {\widehat{{\mathfrak {D}}}}\Vert _{H^\infty ({{\mathbb {C}}}^+, {{\mathcal {B}}}(U,Y))} = \Vert M_{{\widehat{{\mathfrak {D}}}}} \Vert \le \sqrt{1 - \delta } < 1, \end{aligned}$$

i.e., \({\widehat{{\mathfrak {D}}}}\) is in the strict Schur class with \({{{\mathbb {C}}}^{+}}\subset {\text {dom}}({\widehat{{\mathfrak {D}}}})\), and we have arrived at statement (1) as wanted. \(\square \)
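The chain of estimates just used, from the time-domain energy inequality to the \(H^\infty \)-bound, can be observed numerically in a simple finite dimensional case. The sketch below is our own illustration for a scalar first-order system: with \({{\mathbf {x}}}(0)=0\), the observed \(L^2\)-gain never exceeds \(\sup _\omega |{\widehat{{\mathfrak {D}}}}(i\omega )|\) (here equal to 0.75, attained at \(\omega =0\)).

```python
import numpy as np
from scipy.signal import lsim

# Sketch: for x(0) = 0, the L2-gain ||y||_2 / ||u||_2 is dominated by the
# H-infinity norm sup_w |Dhat(i w)|, mirroring the Plancherel step above.
a, b, c, d = 2.0, 1.0, 1.0, 0.25       # xdot = -a x + b u, y = c x + d u
t = np.linspace(0.0, 60.0, 6001)
rng = np.random.default_rng(1)

w = np.linspace(-200.0, 200.0, 4001)
hinf = np.abs(c * b / (1j * w + a) + d).max()   # grid value of ||Dhat||_oo

for _ in range(3):
    u = rng.standard_normal(t.size)             # rough broadband test input
    _, y, _ = lsim(([[-a]], [[b]], [[c]], [[d]]), u, t)
    gain = np.sqrt((y**2).sum() / (u**2).sum()) # dt cancels in the ratio
    print(round(gain, 4), "<=", round(hinf, 4))
```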

Next we work towards a proof of the remainder of Theorem 1.12, namely that the implication (1) \(\Rightarrow \) (2a) holds under the additional hypothesis that \({\mathfrak {A}}\) is exponentially stable and that at least one of the hypotheses (H1), (H2), (H3) holds. The tool for this analysis is to dilate \(\Sigma \) into a well-posed system \(\Sigma _\varepsilon \) whose KYP-inequality admits a bounded and boundedly invertible solution H; this H then turns out to be a bounded and boundedly invertible solution of the strict KYP-inequality for the original well-posed system \(\Sigma \). The details are as follows.

The first step is to embed the system node \({{\mathbf {S}}}\) of \(\Sigma \) into a larger system node \({{\mathbf {S}}}_\varepsilon \) via a procedure which we call \(\varepsilon \)-regularization. We extend the operators \(B \in {{\mathcal {B}}}(U, X_{-1})\) and \(C \in {{\mathcal {B}}}(X_1, Y)\) to operators \( B_\varepsilon = \begin{bmatrix} B&\varepsilon 1_X \end{bmatrix} \in {{\mathcal {B}}}(\left[ {\begin{matrix} U \\ X \end{matrix}}\right] , X_{-1})\) and \(C_\varepsilon = \left[ {\begin{matrix} C \\ \varepsilon 1_X \\ 0 \end{matrix}}\right] \in {{\mathcal {B}}}(X_1, \left[ {\begin{matrix}Y \\ X \\ U \end{matrix}}\right] )\). Using the operators \(B_\varepsilon \) and A we define \( \begin{bmatrix}{A \& B}\end{bmatrix}_\varepsilon \) with domain

$$ \begin{aligned} {\text {dom}}(\begin{bmatrix}{A \& B}\end{bmatrix}_\varepsilon ) \!:=\!\! \left\{ \begin{bmatrix}x\\ u\\ u_1\end{bmatrix}\in \begin{bmatrix}X\\ U\\ X\end{bmatrix} \biggm \vert A_{-1}x+B_\varepsilon \begin{bmatrix}u\\ u_1\end{bmatrix}\in X \right\} \!=\! \begin{bmatrix}{\text {dom}}({A \& B})\\ X\end{bmatrix}, \end{aligned}$$

and action given by

$$ \begin{aligned} \begin{bmatrix}{A \& B}\end{bmatrix}_\varepsilon :=\begin{bmatrix}A_{-1}&B_\varepsilon \end{bmatrix}\Big |_{{\text {dom}}(\begin{bmatrix}{A \& B}\end{bmatrix}_\varepsilon )}=\begin{bmatrix}{A \& B}&\varepsilon 1_X\end{bmatrix}. \end{aligned}$$

Next we define \( \begin{bmatrix}{C \& D}\end{bmatrix}_\varepsilon \) on \( {\text {dom}}(\begin{bmatrix}{C \& D}\end{bmatrix}_\varepsilon )={\text {dom}}(\begin{bmatrix}{A \& B}\end{bmatrix}_\varepsilon )\) by

$$ \begin{aligned} \begin{bmatrix}{C \& D}\end{bmatrix}_\varepsilon \begin{bmatrix}x\\ u\\ u_1\end{bmatrix}:= & {} C_\varepsilon \left( x-(\alpha -A_{-1})^{-1}B_\varepsilon \begin{bmatrix}u\\ u_1\end{bmatrix}\right) \nonumber \\&+\begin{bmatrix}{\widehat{{\mathfrak {D}}}}(\alpha )&{}\varepsilon \,C(\alpha -A)^{-1} \\ \varepsilon \, (\alpha -A_{-1})^{-1}B&{}\varepsilon ^2(\alpha -A)^{-1}\\ \varepsilon 1_U&{}0\end{bmatrix}\begin{bmatrix}u\\ u_1\end{bmatrix}, \end{aligned}$$
(8.8)

where \(\alpha \in \rho (A)\) is the same number \(\alpha \) as used in the definition of \( C \& D\) via formula (4.3) as part of the definition of \( \left[ {\begin{matrix}{A \& B}\\ {C \& D} \end{matrix}}\right] \), and where \({\widehat{{\mathfrak {D}}}}(\alpha )\) is the value at \(\alpha \) of the transfer function \({\widehat{{\mathfrak {D}}}}\) for the original well-posed system \(\Sigma \). It is now an easy exercise to verify that \( {{\mathbf {S}}}_\varepsilon : = \left[ {\begin{matrix} (A \& B)_\varepsilon \\ (C \& D)_\varepsilon \end{matrix}}\right] \) is a system node in the sense of Definition 4.1.
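In the finite dimensional case (where \(A_{-1}=A\) and \(X_1=X\)) the construction above is completely transparent. The following sketch is our own illustration with arbitrary example data; it evaluates the transfer function of \({{\mathbf {S}}}_\varepsilon \) from (8.8) via formula (4.4) for two different choices of \(\alpha \in \rho (A)\) and checks that the result is independent of \(\alpha \) and agrees with the direct formula (8.9) appearing in Lemma 8.3 below.

```python
import numpy as np

# Sketch of the epsilon-regularization in finite dimensions, where
# A_{-1} = A and X_1 = X; all data below are arbitrary example choices.
A = np.array([[-1.0, 1.0], [0.0, -3.0]])
B = np.array([[1.0], [0.5]])
C = np.array([[0.3, -0.3]])
D = np.array([[0.2]])
eps = 0.1
I = np.eye(2)

def R(s):
    """Resolvent (s - A)^{-1}."""
    return np.linalg.inv(s * I - A)

def Dhat(s):
    """Transfer function of the original system."""
    return C @ R(s) @ B + D

def Dhat_eps(lam, alpha):
    """Transfer function of S_eps, computed from (8.8) via (4.4)."""
    Ceps = np.vstack([C, eps * I, np.zeros((1, 2))])
    Beps = np.hstack([B, eps * I])
    blk = np.block([[Dhat(alpha),        eps * C @ R(alpha)],
                    [eps * R(alpha) @ B, eps**2 * R(alpha)],
                    [eps * np.eye(1),    np.zeros((1, 2))]])
    return Ceps @ (R(lam) - R(alpha)) @ Beps + blk

lam = 1.0 + 2.0j
# the result does not depend on the choice of alpha in rho(A):
print(np.abs(Dhat_eps(lam, 1.0) - Dhat_eps(lam, 5.0)).max())  # ~ 0
# and it agrees with the direct formula (8.9):
direct = np.block([[Dhat(lam),        eps * C @ R(lam)],
                   [eps * R(lam) @ B, eps**2 * R(lam)],
                   [eps * np.eye(1),  np.zeros((1, 2))]])
print(np.abs(Dhat_eps(lam, 2.0) - direct).max())              # ~ 0
```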

Our next goal is to apply Theorem 4.3 to show that \({{\mathbf {S}}}_\varepsilon \) is the system node arising from a well-posed linear system \(\Sigma _\varepsilon \). Note that Theorem 4.3 calls for a choice of \(\omega \in {{\mathbb {R}}}\) with \(\omega _{\mathfrak {A}}< \omega \). Here we shall be assuming that \({\mathfrak {A}}\) is exponentially stable, i.e., that \(\omega _{\mathfrak {A}}< 0\). Hence we have the option (which we shall use) of taking \(\omega = 0\) in the application of Theorem 4.3. For this case it is customary to simplify the terminology 0-bounded (i.e., \(\omega \)-bounded for the case \(\omega = 0\)) to simply bounded. Thus \({\mathfrak {B}}\), \({\mathfrak {C}}\), \({\mathfrak {D}}\) being bounded means that the operators \({\widetilde{{\mathfrak {B}}}}\), \({\widetilde{{\mathfrak {C}}}}\), \({\widetilde{{\mathfrak {D}}}}\) appearing in (2.7) satisfy

$$\begin{aligned} {\widetilde{{\mathfrak {B}}}} \in {{\mathcal {B}}}(L^{2-}_U, X), \quad {\widetilde{{\mathfrak {C}}}} \in {{\mathcal {B}}}(X, L^{2+}_Y), \quad {\widetilde{{\mathfrak {D}}}} \in {{\mathcal {B}}}(L^2_U, L^2_Y). \end{aligned}$$

The following lemma encodes the main properties of the \(\varepsilon \)-regularized system node \({{\mathbf {S}}}_\varepsilon \). In particular, the \(\varepsilon \)-regularization process can be viewed as producing a dilation at three levels:

  • at the system node level: \({{\mathbf {S}}}_\varepsilon \) can be seen as a dilation of \({{\mathbf {S}}}\);

  • at the transfer-function level: \({\widehat{{\mathfrak {D}}}}_\varepsilon \) can be seen as a dilation of \({\widehat{{\mathfrak {D}}}}\);

  • at the well-posed level: \(\left[ {\begin{matrix} {\mathfrak {A}}_\varepsilon ^t &{} {\mathfrak {B}}_\varepsilon ^t \\ {\mathfrak {C}}_\varepsilon ^t &{} {\mathfrak {D}}_\varepsilon ^t \end{matrix}}\right] \) can be seen as a dilation of \(\left[ {\begin{matrix} {\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{} {\mathfrak {D}}^t \end{matrix}}\right] \).

Lemma 8.3

Assume that \(\Sigma =\left[ {\begin{matrix}{\mathfrak {A}}&{}{\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) is an exponentially stable well-posed system with associated system node \( {{\mathbf {S}}}=\left[ {\begin{matrix}{A \& B}\\ {C \& D} \end{matrix}}\right] \) with a strict Schur class transfer function \({\widehat{{\mathfrak {D}}}}\in {{\mathcal {S}}}^0_{U,Y}\). Then, for all \(\varepsilon >0\), the operator

$$ \begin{aligned} {{\mathbf {S}}}_\varepsilon =\left[ {\begin{matrix}{A \& B}\\ {C \& D} \end{matrix}}\right] _\varepsilon :=\left[ {\begin{matrix}\left[ {\begin{matrix}{A \& B} \end{matrix}}\right] _\varepsilon \\ \left[ {\begin{matrix}{C \& D} \end{matrix}}\right] _\varepsilon \end{matrix}}\right] \end{aligned}$$

constructed above is the system node of a minimal, exponentially stable, bounded, well-posed system \(\Sigma _\varepsilon \) with transfer function \({\widehat{{\mathfrak {D}}}}_\varepsilon \) given by

$$\begin{aligned} {\widehat{{\mathfrak {D}}}}_\varepsilon (\lambda )=\begin{bmatrix}{\widehat{{\mathfrak {D}}}}(\lambda )&{} \varepsilon \,C(\lambda -A)^{-1}\\ \varepsilon \,(\lambda -A_{-1})^{-1}B&{}\varepsilon ^2(\lambda -A)^{-1}\\ \varepsilon 1_U&{}0\end{bmatrix}, \qquad \lambda \in \rho (A). \end{aligned}$$
(8.9)

For \(\varepsilon >0\) sufficiently small, \({\widehat{{\mathfrak {D}}}}_\varepsilon \) is also in the strict Schur class over \({{{\mathbb {C}}}^{+}}\).

For each \(t\geqslant 0\) the t-dependent operators \(\left[ {\begin{matrix}{\mathfrak {A}}_\varepsilon ^t &{} {\mathfrak {B}}_\varepsilon ^t\\ {\mathfrak {C}}_\varepsilon ^t &{} {\mathfrak {D}}_\varepsilon ^t \end{matrix}}\right] \) for the well-posed system \(\Sigma _\varepsilon \) have the form

$$\begin{aligned} \begin{bmatrix}{\mathfrak {A}}_\varepsilon ^t &{} {\mathfrak {B}}_\varepsilon ^t\\ {\mathfrak {C}}_\varepsilon ^t &{} {\mathfrak {D}}_\varepsilon ^t\end{bmatrix} = \begin{bmatrix}{\mathfrak {A}}^t &{} {\mathfrak {B}}^t &{} {\mathfrak {B}}_1^t\\ {\mathfrak {C}}^t &{}{\mathfrak {D}}^t&{} {\mathfrak {D}}_1^t\\ {\mathfrak {C}}_1^t &{}{\mathfrak {D}}_2^t&{} {\mathfrak {D}}_3^t\\ 0 &{}\varepsilon 1_{L^2([0,t],U)} &{} 0\end{bmatrix}: \begin{bmatrix}X\\ L^2([0,t],U) \\ L^2([0,t],X)\end{bmatrix} \rightarrow \begin{bmatrix}X\\ L^2([0,t],Y) \\ L^2([0,t],X) \\ L^2([0,t],U)\end{bmatrix}, \end{aligned}$$
(8.10)

with \({\mathfrak {A}}^t\), \({\mathfrak {B}}^t\), \({\mathfrak {C}}^t\) and \({\mathfrak {D}}^t\) equal to the t-dependent operators determined by the original system \(\Sigma \) and \({\mathfrak {B}}_1^t\), \({\mathfrak {C}}_1^t\), \({\mathfrak {D}}_1^t\), \({\mathfrak {D}}_2^t\) and \({\mathfrak {D}}_3^t\) some operators acting between appropriate spaces.

If \(\Sigma \) is \(L^2\)-controllable (\(L^2\)-observable), then also \(\Sigma _\varepsilon \) is \(L^2\)-controllable (\(L^2\)-observable).

Proof

It was already left as an exercise for the reader to check that \({{\mathbf {S}}}_\varepsilon \) is a system node. In order to prove that \({{\mathbf {S}}}_\varepsilon \) is the system node of a well-posed system \(\Sigma _\varepsilon \), we prove that conditions (1)–(3) of Theorem 4.3 are satisfied.

First we verify that \(B_\varepsilon \) is an admissible control operator for A. For all \(\left[ {\begin{matrix}{{\mathbf {u}}}\\ {{\mathbf {u}}}_1 \end{matrix}}\right] \in L^{2-}_{\ell ,U\times X}\), the formula for \(B_\varepsilon \) gives

$$\begin{aligned} {\mathfrak {B}}_\varepsilon \begin{bmatrix}{{\mathbf {u}}}\\ {{\mathbf {u}}}_1\end{bmatrix}= \int _{-\infty }^0 {\mathfrak {A}}_{-1}^{-s}B{{\mathbf {u}}}(s)\,{\mathrm {d}}s+ \varepsilon \int _{-\infty }^0 {\mathfrak {A}}^{-s}{{\mathbf {u}}}_1(s)\,{\mathrm {d}}s\in X. \end{aligned}$$
(8.11)

The first term lands in X since B is admissible for A. The second term lands in X by the compact support of \({{\mathbf {u}}}_1\) and the uniform boundedness of \({\mathfrak {A}}\) on compact intervals. Thus \(B_\varepsilon \) is an admissible control operator for A. We next observe that \(C_\varepsilon \) is an admissible observation operator for A, i.e., that

$$\begin{aligned} x\mapsto \left( \begin{bmatrix}C\\ \varepsilon 1_X\\ 0\end{bmatrix}{\mathfrak {A}}^tx\right) _{t\geqslant 0},\quad x\in {\text {dom}}(A), \end{aligned}$$

can be extended to a continuous linear operator from X to \(L^{2+}_{loc,Y\times X\times U}\); indeed, C is admissible for A and from \(\omega _{\mathfrak {A}}<0\), we get

$$\begin{aligned} \int _0^T \Vert \varepsilon {\mathfrak {A}}^tx\Vert ^2\,{\mathrm {d}}t\leqslant -\frac{2M^2\varepsilon ^2}{\omega _{\mathfrak {A}}}\Vert x\Vert ^2. \end{aligned}$$

This completes the verification of conditions (1) and (2) in Theorem 4.3.

In order to verify condition (3), we first prove formula (8.9) for the transfer function \({\widehat{{\mathfrak {D}}}}_\varepsilon \) of the system node \({{\mathbf {S}}}_\varepsilon \). To this end, we use formulas (4.4) and (8.8) to compute:

$$ \begin{aligned} {\widehat{{\mathfrak {D}}}}_\varepsilon (\lambda )&= \begin{bmatrix}{C \& D}\end{bmatrix}_\varepsilon \begin{bmatrix}(\lambda -A_{-1})^{-1}B_\varepsilon \\ \begin{bmatrix}1_U&{}0\\ 0&{}1_X\end{bmatrix}\end{bmatrix} \\&=\begin{bmatrix} C \\ \varepsilon 1_X \\ 0 \end{bmatrix} \big ( (\lambda - A_{-1})^{-1} - (\alpha - A_{-1})^{-1}\big ) \begin{bmatrix}B&\varepsilon 1_X\end{bmatrix} \\&\qquad +\begin{bmatrix} {\widehat{{\mathfrak {D}}}}(\alpha ) &{} \varepsilon C (\alpha - A)^{-1} \\ \varepsilon (\alpha - A_{-1})^{-1} B &{} \varepsilon ^2 (\alpha - A)^{-1} \\ \varepsilon 1_U &{} 0 \end{bmatrix},\quad \lambda \in \rho (A), \end{aligned}$$

and observing that the (1, 1) entry equals \( {C \& D}\left[ {\begin{matrix}(\lambda -A_{-1})^{-1}B\\ 1_U \end{matrix}}\right] \), we get (8.9). To verify condition (3) in Theorem 4.3 applied to \({{\mathbf {S}}}_\varepsilon \), we need to verify that each block entry appearing in the formula (8.9) for \({\widehat{{\mathfrak {D}}}}_\varepsilon \) is in \(H^\infty ({{{\mathbb {C}}}^{+}};{{\mathcal {B}}}(K,L))\) for the relevant \(K,L = X,U,Y\) as appropriate. Since the original system \(\Sigma \) is well-posed with \(\omega _{\mathfrak {A}}< 0\), we can apply [31, Lemma 10.3.3] (with parameter \(\omega \) taken to be \(\omega = 0\)) to conclude that

$$\begin{aligned} \lambda \mapsto (\lambda - A)^{-1} , \quad \lambda \mapsto (\lambda - A_{-1})^{-1} B ,\quad \lambda \mapsto C (\lambda - A)^{-1} ,\quad \lambda \in {{{\mathbb {C}}}^{+}}, \end{aligned}$$

are all in \(H^\infty \) over \({{{\mathbb {C}}}^{+}}\), as wanted. With these observations in hand, it is clear that for \(\varepsilon > 0\) sufficiently small, \({\widehat{{\mathfrak {D}}}}_\varepsilon \) is in the strict Schur class as well, since the \(\varepsilon \)-dependent block entries in (8.9) have \(H^\infty \)-norm of order \(\varepsilon \). Moreover, it now follows from Theorem 4.3 that \({{\mathbf {S}}}_\varepsilon \) is the system node of a bounded, well-posed system \(\Sigma _\varepsilon \), which is exponentially stable since its \(C_0\)-semigroup is the same as that of the original system \(\Sigma \). The formula (8.10) for \(\left[ {\begin{matrix} {\mathfrak {A}}_\varepsilon ^t &{} {\mathfrak {B}}_\varepsilon ^t \\ {\mathfrak {C}}_\varepsilon ^t &{} {\mathfrak {D}}_\varepsilon ^t \end{matrix}}\right] \) is a straightforward consequence of the construction.
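This smallness condition is easy to visualize in finite dimensions. The following sketch (our own illustration, using the same arbitrary example data as in the previous sketch, whose transfer function is a strict Schur function) scans a grid on the imaginary axis and shows that the supremum norm of \({\widehat{{\mathfrak {D}}}}_\varepsilon \) stays below 1 for small \(\varepsilon \) but eventually exceeds 1 as \(\varepsilon \) grows.

```python
import numpy as np

# Sketch: sup-norm of Dhat_eps on the imaginary axis as a function of eps,
# for arbitrary stable example data with ||Dhat||_oo < 1.
A = np.array([[-1.0, 1.0], [0.0, -3.0]])
B = np.array([[1.0], [0.5]])
C = np.array([[0.3, -0.3]])
D = np.array([[0.2]])
I = np.eye(2)
w_grid = np.linspace(-50.0, 50.0, 2001)

def hinf_eps(eps):
    """Grid approximation of the H-infinity norm of Dhat_eps from (8.9)."""
    best = 0.0
    for w in w_grid:
        R = np.linalg.inv(1j * w * I - A)
        De = np.block([[C @ R @ B + D,   eps * C @ R],
                       [eps * R @ B,     eps**2 * R],
                       [eps * np.eye(1), np.zeros((1, 2))]])
        best = max(best, np.linalg.norm(De, 2))   # largest singular value
    return best

for eps in [0.0, 0.1, 0.3, 0.6]:
    print(eps, round(hinf_eps(eps), 4))
# The norm is < 1 for eps = 0 (strict Schur) and for small eps > 0, but it
# creeps past 1 once eps is taken too large.
```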

We next discuss minimality. Fixing any \(x\in X\) orthogonal to \({\text {ran}}({\mathfrak {B}}_\varepsilon )\), we get from (8.11) that for all \({{\mathbf {u}}}_1\in L^{2-}_{\ell ,X}\):

$$\begin{aligned} 0=\left\langle x , \varepsilon \int _{-\infty }^0 {\mathfrak {A}}^{-s}{{\mathbf {u}}}_1(s)\,{\mathrm {d}}s \right\rangle _X =\varepsilon \left\langle s\mapsto {\mathfrak {A}}^{-s*}x , {{\mathbf {u}}}_1 \right\rangle _{L^{2-}_X}. \end{aligned}$$
(8.12)

By the density of \(L^{2-}_{\ell ,X}\) in \(L^{2-}_{X}\), the continuous function \(s\mapsto {\mathfrak {A}}^{-s*}x\) must vanish on \((-\infty ,0)\), and letting \(s\rightarrow 0^-\), we get that \(x=0\), i.e., \(\Sigma _\varepsilon \) is (approximately) controllable. Since \(C^*_\varepsilon =\begin{bmatrix}C^*&\varepsilon 1_X&0\end{bmatrix}\), the argument leading to (8.12), now applied to the adjoint, gives that \(\Sigma _\varepsilon ^*\) is controllable, i.e., \(\Sigma _\varepsilon \) is observable; hence \(\Sigma _\varepsilon \) is minimal. As \({\mathfrak {A}}\) is exponentially stable by assumption, \({{\mathbf {W}}}_{c,\varepsilon }\) is bounded by Lemma 3.6, and hence it follows from (8.11) that \({{\mathbf {W}}}_{c,\varepsilon }=\begin{bmatrix}{{\mathbf {W}}}_c&{{\mathbf {W}}}_\varepsilon \end{bmatrix}\) for some bounded operator \({{\mathbf {W}}}_\varepsilon :L^{2-}_X\rightarrow X\); it is then immediate from Definition 3.7 that \(\Sigma _\varepsilon \) inherits \(L^2\)-controllability from \(\Sigma \). By (3.5), the bounded \(L^2\)-controllability map of \(\Sigma ^d_\varepsilon \) admits an analogous decomposition, and so \(\Sigma _\varepsilon ^*\) is \(L^2\)-controllable, i.e., \(\Sigma _\varepsilon \) is \(L^2\)-observable, whenever \(\Sigma \) is \(L^2\)-observable. \(\square \)
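The minimality argument has a familiar finite dimensional analogue: adjoining the block \(\varepsilon 1_X\) to B (and, dually, to C) makes the controllability (respectively observability) Gramian strictly positive definite even if the original pair is degenerate. A sketch of ours, with hypothetical example data:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Sketch: the epsilon-blocks force minimality. Below, B excites only the
# first state, so (A, B) is not controllable, but (A, B_eps) is.
A = np.diag([-1.0, -2.0])
B = np.array([[1.0], [0.0]])               # second state is unreachable
eps = 0.1
Beps = np.hstack([B, eps * np.eye(2)])     # B_eps = [B  eps*1_X]

# controllability Gramians W solve A W + W A^T + B B^T = 0
W = solve_continuous_lyapunov(A, -B @ B.T)
Weps = solve_continuous_lyapunov(A, -Beps @ Beps.T)
print(np.linalg.eigvalsh(W))     # smallest eigenvalue 0: not controllable
print(np.linalg.eigvalsh(Weps))  # all eigenvalues > 0: controllable
# The dual computation with C_eps = [C; eps*1_X; 0] gives observability.
```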

Now we can prove the last part of Theorem 1.12.

Proof of (1) \(\Rightarrow \) (2a) in Theorem 1.12. To complete the proof of Theorem 1.12 it remains to show that (1) \(\Rightarrow \) (2a) holds under the assumption that \({\mathfrak {A}}\) is exponentially stable and that at least one of the additional conditions (H1), (H2) or (H3) holds. Assume \({\widehat{{\mathfrak {D}}}}\in {{\mathcal {S}}}^0_{U,Y}\). Let \(\Sigma _\varepsilon \) be the \(\varepsilon \)-regularized system constructed above, where we take \(\varepsilon >0\) small enough, so that the transfer function \({\widehat{{\mathfrak {D}}}}_\varepsilon \) of \(\Sigma _\varepsilon \) is still a strict Schur class function.

We claim that each of the conditions (H1), (H2) and (H3) implies that the standard KYP-inequality for \(\Sigma _\varepsilon \) has a bounded, strictly positive definite solution H. Assuming (H1), note that the operators \(B_\varepsilon \) and \(C_\varepsilon \) satisfy the conditions of Proposition 5.3, so that item (3) of that proposition implies that \(\Sigma _\varepsilon \) is \(L^2\)-minimal. Then the \(L^2\)-minimal standard bounded real lemma, Theorem 1.10, shows that the standard KYP-inequality for \(\Sigma _\varepsilon \) has a bounded, strictly positive definite solution \(H_\varepsilon \). In fact, both the operators \(H_{\varepsilon ,a}\) and \(H_{\varepsilon ,r}\) associated with the available storage and the required supply of \(\Sigma _\varepsilon \) are bounded and strictly positive definite.

For (H2) and (H3), note that \(\Sigma _\varepsilon \) is minimal and exponentially stable. Therefore, by Proposition 8.1, \(H_{\varepsilon ,a}\) and \(H_{\varepsilon ,r}^{-1}\) are bounded and their inverses are bounded precisely when \(\Sigma _\varepsilon \) is \(L^2\)-observable and \(L^2\)-controllable, respectively. Since, by Lemma 8.3, \(L^2\)-observability of \(\Sigma \) implies \(L^2\)-observability of \(\Sigma _\varepsilon \), and likewise for \(L^2\)-controllability, it follows that \(H_{\varepsilon ,a}\) is a bounded, strictly positive definite solution to the KYP inequality for \(\Sigma _\varepsilon \) whenever (H3) holds, while (H2) implies that \(H_{\varepsilon ,r}\) is a bounded, strictly positive definite solution to the KYP inequality for \(\Sigma _\varepsilon \).

Hence, assuming (H1), (H2) or (H3) as well as exponential stability, we obtain a bounded, strictly positive definite solution H to the standard KYP inequality for \(\Sigma _\varepsilon \). Our next goal is to show that this H is also a solution to the strict KYP inequality (1.16) for the original system \(\Sigma \), thereby arriving at (2a) and completing the proof that (1), together with the extra hypotheses, implies (2a). We first need to probe a little deeper into the structure of \(\Sigma _\varepsilon \).

We shall need more explicit formulas for the operators \({\mathfrak {C}}^t_1\) and \({\mathfrak {D}}^t_2\) appearing in (8.10). It is easy to see from the definition of \(C_\varepsilon \) that \({\mathfrak {C}}^t_1 :X \rightarrow L^2([0,t],X)\) is given by

$$\begin{aligned} {\mathfrak {C}}^t_1 :x_0 \mapsto \bigg ( s \mapsto \varepsilon \,{\mathfrak {A}}^s x_0 \bigg )_{0 \le s \le t}. \end{aligned}$$

As for \({\mathfrak {D}}^t_2\), what we know from (8.9) is that

$$\begin{aligned} {{\mathcal {L}}}{\mathfrak {D}}_2 {{\mathcal {L}}}^{-1} = M_{{\widehat{{\mathfrak {D}}}}_2} \end{aligned}$$

where \({{\mathcal {L}}}\) is the bilateral Laplace transform, and where by (8.9) we know that

$$\begin{aligned} {\widehat{{\mathfrak {D}}}}_2(\lambda ) = \varepsilon \,(\lambda - A_{-1})^{-1} B,\quad \lambda \in \rho (A). \end{aligned}$$
(8.13)

For a general well-posed linear system \(\Sigma = \left[ {\begin{matrix} {\mathfrak {A}}&{} {\mathfrak {B}}\\ {\mathfrak {C}}&{} {\mathfrak {D}} \end{matrix}}\right] \) it is difficult to compute the input-output map \({\mathfrak {D}}\) explicitly from the transfer function \({\widehat{{\mathfrak {D}}}}(\lambda )\). Here, however, \({\widehat{{\mathfrak {D}}}}_2\) is a simple expression in the resolvent of the semigroup generator A, and experience with the reverse direction (computing the frequency-domain transfer function from the time-domain system equations) suggests that

$$\begin{aligned} {\mathfrak {D}}^t_2 :{{\mathbf {u}}}|_{[0,t]} \mapsto \bigg ( s \mapsto \varepsilon \int _0^s {\mathfrak {A}}^{s - r}_{-1} B {{\mathbf {u}}}(r) \,{\mathrm {d}}r \bigg )_{0 \le s \le t}; \end{aligned}$$

indeed this is correct: (8.13) is the transfer function arising for the special case \( {C \& D}= \left[ {\begin{matrix}\varepsilon 1_X&0 \end{matrix}}\right] \), and applying (4.6) to this special \( {C \& D}\) yields exactly the formula above.
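Although we do not rely on it, this identification is also easy to test numerically in finite dimensions. The sketch below is our own check, with arbitrary stable example data: for a harmonic input, the convolution formula for \({\mathfrak {D}}_2\) reproduces the frequency response (8.13) once transients have died out.

```python
import numpy as np
from scipy.linalg import expm

# Sketch: for u(r) = e^{i w r} and stable A, the time-domain formula for D_2
# matches Dhat_2(i w) = eps (i w - A)^{-1} B after the transient decays.
A = np.array([[-1.0, 1.0], [0.0, -3.0]])   # arbitrary stable example data
B = np.array([[1.0], [0.5]])
eps, w, s = 0.1, 2.0, 15.0
r = np.linspace(0.0, s, 3001)
dr = r[1] - r[0]

# eps * int_0^s e^{A(s - r)} B u(r) dr, computed by the trapezoid rule
vals = np.stack([expm(A * (s - rk)) @ B[:, 0] * np.exp(1j * w * rk)
                 for rk in r])
conv = eps * dr * (0.5 * vals[0] + vals[1:-1].sum(axis=0) + 0.5 * vals[-1])

# steady-state prediction from (8.13)
freq = eps * np.linalg.inv(1j * w * np.eye(2) - A) @ B[:, 0] * np.exp(1j * w * s)
print(np.abs(conv - freq).max())   # ~ 0 up to quadrature and transient error
```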

We conclude that if \(({{\mathbf {u}}}, {{\mathbf {x}}}, {{\mathbf {y}}})\) is a system trajectory on \({{\mathbb {R}}}^+\) with \({{\mathbf {x}}}(0) = x_0\), then

$$\begin{aligned} \begin{bmatrix} {\mathfrak {C}}^t_1&{\mathfrak {D}}^t_2 \end{bmatrix} :\begin{bmatrix} x_0 \\ {{\mathbf {u}}}|_{[0,t]} \end{bmatrix} \mapsto \bigg ( s \mapsto \varepsilon \,{{\mathbf {x}}}(s) \bigg )_{0 \le s \le t} = \varepsilon \begin{bmatrix} {\mathfrak {C}}^t_{1_X,A}&{\mathfrak {D}}^t_{A,B} \end{bmatrix}, \end{aligned}$$

where the right hand side is defined in (1.15).

Let us now suppose that H is a bounded, strictly positive definite solution of the standard KYP-inequality associated with the \(\varepsilon \)-regularized well-posed system \(\Sigma _\varepsilon \). Then H satisfies

$$\begin{aligned} \begin{bmatrix}{\mathfrak {A}}_\varepsilon ^t &{} {\mathfrak {B}}_\varepsilon ^t\\ {\mathfrak {C}}_\varepsilon ^t &{} {\mathfrak {D}}_\varepsilon ^t\end{bmatrix}^* \begin{bmatrix}H&{} 0\\ 0 &{} 1_{L^2([0,t], \left[ {\begin{matrix}Y \\ X \\ U \end{matrix}}\right] )}\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}_\varepsilon ^t &{} {\mathfrak {B}}_\varepsilon ^t\\ {\mathfrak {C}}_\varepsilon ^t &{} {\mathfrak {D}}_\varepsilon ^t\end{bmatrix} \preceq \begin{bmatrix}H&{} 0\\ 0 &{} 1_{L^2([0,t], \left[ {\begin{matrix}U \\ X \end{matrix}}\right] )}\end{bmatrix}. \end{aligned}$$

Compressing this inequality to \(X \oplus L^2([0,t],U)\) and writing out \(\left[ {\begin{matrix}{\mathfrak {A}}_\varepsilon ^t &{} {\mathfrak {B}}_\varepsilon ^t\\ {\mathfrak {C}}_\varepsilon ^t &{} {\mathfrak {D}}_\varepsilon ^t \end{matrix}}\right] \) yields

$$\begin{aligned}&\begin{bmatrix}H&{} 0\\ 0 &{} 1_{L^2([0,t],U)}\end{bmatrix} \\&\quad \succeq \begin{bmatrix}{\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{}{\mathfrak {D}}^t\\ \varepsilon {\mathfrak {C}}_{1_X,A}^t &{}\varepsilon {\mathfrak {D}}_{A,B}^t\\ 0 &{}\varepsilon 1_{L^2([0,t],U)}\end{bmatrix}^* \begin{bmatrix}H&{}0\\ 0&{} 1_{L^2([0,t], \left[ {\begin{matrix}Y \\ X \\ U \end{matrix}}\right] )}\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{}{\mathfrak {D}}^t\\ \varepsilon {\mathfrak {C}}_{1_X,A}^t &{} \varepsilon {\mathfrak {D}}_{A,B}^t\\ 0 &{}\varepsilon 1_{L^2([0,t],U)}\end{bmatrix}\\&\quad = \begin{bmatrix}{\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{}{\mathfrak {D}}^t\end{bmatrix}^* \begin{bmatrix}H&{} 0\\ 0 &{} 1_{L^2([0,t],Y)}\end{bmatrix} \begin{bmatrix}{\mathfrak {A}}^t &{} {\mathfrak {B}}^t \\ {\mathfrak {C}}^t &{}{\mathfrak {D}}^t\end{bmatrix} + \varepsilon ^2 \begin{bmatrix}{\mathfrak {C}}_{1_X,A}^{t *}\\ {\mathfrak {D}}_{A,B}^{t*}\end{bmatrix} \begin{bmatrix}{\mathfrak {C}}_{1_X,A}^t&{\mathfrak {D}}_{A,B}^t\end{bmatrix} \\&\qquad + \begin{bmatrix}0&{}0\\ 0&{}\varepsilon ^2 1_{L^2([0,t],U)}\end{bmatrix}. \end{aligned}$$

Subtracting \(\left[ {\begin{matrix}0&{}0\\ 0&{}\varepsilon ^2 1_{L^2([0,t],U)} \end{matrix}}\right] \) from both sides gives (1.16) with \(\delta =\varepsilon ^2>0\) and this completes the proof. \(\square \)