We assume the reader is familiar with linear time-invariant (LTI) systems; in this chapter we merely summarise the main results of the theory. We are going to call the quantities under consideration, the input, the output and some characterization of the system, signals. This term should evoke a meaningful interpretation in most of the systems that we are going to discuss. Mathematically, they are distributions.

1 Basic Definitions

The meaning of time-invariant is very intuitive: suppose that we apply the input signal x(t) to a system represented by an operator \({\mathcal {H}}\) and observe the signal

$$\begin{aligned} y(t) = {\mathcal {H}}[x(t)] \end{aligned}$$

as its output (Fig. 8.1). The system is said to be time-invariant if by applying the delayed input signal \(x(t - \tau )\) we observe the same output signal as before, except for a delay in time by an amount \(\tau \), that is, if

$$\begin{aligned} y(t - \tau ) = {\mathcal {H}}[x(t - \tau )]\,. \end{aligned}$$
(8.1)

The concept of linearity is subtler. A defining property of a linear system is the validity of the superposition principle: if \(y_1(t)\) is the response of the system to the input \(x_1(t)\) and \(y_2(t)\) the one to \(x_2(t)\), then the response to a linear combination of these inputs is

$$\begin{aligned} y(t) &= {\mathcal {H}}[c_1 x_1(t) + c_2 x_2(t)] = c_1 {\mathcal {H}}[x_1(t)] + c_2 {\mathcal {H}}[x_2(t)] \\ &= c_1 y_1(t) + c_2 y_2(t) \end{aligned}$$
(8.2)

with \(c_1\) and \(c_2\) constants. However, if we limit the definition of a linear system to this property, then we admit pathological systems such as the following one.

Example 8.1: A Discontinuous System [22]

Consider a system accepting as input a piecewise continuous function with at most a finite number of isolated jump discontinuities. The system response consists of the sum of the input signal's jumps from \(-\infty \) up to the present time t.

The system satisfies (8.2). However, the behaviour is rather peculiar. If we apply, say, a rectangular input then the output is also rectangular. But, if we approximate to any degree of accuracy the rectangular input with a continuous function, then the output is always zero.

To exclude systems with such a bizarre behavior, we require linear systems to be continuous: if as \(m \in {\mathbb {N}}\) tends to \(\infty \) the sequence of input signals \(x_m(t)\) converges (in the sense of distributions) to the signal x(t), then the system response \(y_m(t)\) corresponding to input \(x_m(t)\) converges to the response y(t) corresponding to x(t).

Fig. 8.1 Representation of a single-input single-output LTI system \({\mathcal {H}}\)

Suppose that we apply an impulse \(\delta (t)\) to the input of the system \({\mathcal {H}}\) and observe the signal h(t) at its output. Then, by linearity, if we apply a finite number of pulses the output must be

$$\begin{aligned} {\mathcal {H}}[\sum _{j=1}^n a_{j} \,\delta (t - \tau _{j})] = \sum _{j=1}^n a_{j} \,h(t - \tau _{j}) = h(t) *\sum _{j=1}^n a_{j} \,\delta (t - \tau _{j})\,. \end{aligned}$$

In Sect. 3.3 we saw that every distribution can be represented as the limit of a sequence of finite sums of Dirac impulses. From this and the linearity of convolution (Eq. (3.19)) we obtain that, in the limit as n tends to infinity, if the input converges to the signal x(t) the output of the system converges to

$$\begin{aligned} y(t) = h(t) *x(t)\,. \end{aligned}$$

We therefore define

Definition 8.1

(LTI System) A single-input, single-output (SISO), linear time-invariant (LTI) system is a system that, when driven by an input signal x(t), produces the output

$$\begin{aligned} y(t) = h(t) *x(t) \end{aligned}$$
(8.3)

with h(t) the impulse response of the system.
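As a quick numerical illustration (a sketch only: it assumes sampled signals on a finite grid rather than distributions, and the first-order impulse response \(h(t) = \mathrm{e}^{-t}\) for \(t \ge 0\) is an arbitrary choice), the convolution (8.3) can be approximated with NumPy:

```python
import numpy as np

# Sampled approximation of y = h * x on a uniform grid with step dt.
# Assumed example: h(t) = exp(-t) for t >= 0, rectangular input pulse.
dt = 1e-3
t = np.arange(0.0, 10.0, dt)
h = np.exp(-t)                               # impulse response samples
x = ((t >= 1.0) & (t <= 3.0)).astype(float)  # rectangular input on [1, 3]
y = np.convolve(h, x)[: t.size] * dt         # Riemann-sum approximation of h * x
```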

A system is called real if, when driven by a real distribution, its response is a real distribution; in other words, if its impulse response is a real distribution.

While we have been talking about signals depending on time, we can abstract from that and talk about signals depending on a generic n-dimensional independent variable \(\lambda \in {\mathbb {R}}^n\). In this case, instead of time-invariance, it makes more sense to adapt (8.1) to

$$\begin{aligned} y(\lambda - \tau ) = {\mathcal {H}}[x(\lambda - \tau )] \end{aligned}$$

and talk about translation invariance. A single-input single-output, linear translation-invariant system is then still described by a convolution product similar to (8.3), where however the independent variable t is replaced by the abstract n-dimensional variable \(\lambda \). We are going to call a system of this type an LTI system as well.

2 Causality

Assume for simplicity that h and x are integrable functions of time. The response of a system characterized by h when driven by the input x can then be written in integral form

$$\begin{aligned} y(t) = \int \limits _{-\infty }^\infty h(\tau ) \, x(t - \tau ) \, d\tau = \int \limits _{-\infty }^\infty h(t - \tau ) \, x(\tau ) \, d\tau \,. \end{aligned}$$

Suppose now that the input vanishes for \(t<0\). Then from

$$\begin{aligned} y(t) = \int \limits _{0}^\infty h(t - \tau ) \, x(\tau ) \, d\tau \end{aligned}$$

we see that in general the system may produce a nonzero response y(t) for \(t<0\), that is, before the input signal x(t) has been applied.

If a system is causal, that is, if its output at time \(t_0\) can only depend on values of the input signal at times \(t \le t_0\), then its impulse response h(t) must vanish for \(t <0\). In other words h must be a right-sided distribution in \({\mathcal {D_+'}}\).

Note that in our interpretation of signals as being functions of time, non-causal systems are not physically implementable and appear to be meaningless. However, non-causal systems are sometimes useful in theoretical studies. In addition, in many situations the theory of LTI systems can be applied to systems where the quantities of interest (the input and output) are not functions of time (see Example 7.6).

3 Stability

An important aspect of a system is its stability. Let x(t) be a bounded function, that is, satisfying 

$$\begin{aligned} \Vert x\Vert _\infty :=\sup _{t\in {\mathbb {R}}}|x(t) | < \infty \,. \end{aligned}$$

The response of a system characterized by the impulse response h(t) to such an input signal is

$$\begin{aligned} y(t) = h(t) *x(t)\,. \end{aligned}$$

The output y(t) is well-defined if

$$\begin{aligned} \langle h *x,\phi \rangle < \infty \end{aligned}$$

for every test function \(\phi \in {\mathcal {D}}\), and if for every sequence \((\phi _m)\) converging to zero in \({\mathcal {D}}\)

$$\begin{aligned} \lim _{m\rightarrow \infty }\langle h *x,\phi _m \rangle = 0\,. \end{aligned}$$

In this case we say that the system is bounded-input bounded-output (BIBO) stable.

For a system to be BIBO stable

$$\begin{aligned} \langle h(t) *x(t),\phi (t) \rangle = \langle h(t),\int _{{\mathbb {R}}}\,x(\tau ) \phi (t+\tau )\,d\tau \rangle \end{aligned}$$

must have a meaning. Observe that the inner integral is an indefinitely differentiable bounded function. For the convolution to have a meaning the impulse response of the system must therefore be extensible to a continuous linear form on \({\mathcal {B}}\). As we saw in Sect. 6.1 this is only the case if h is a summable distribution. Thus, for a system to be BIBO stable, its impulse response must be a summable distribution.
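For impulse responses that are ordinary functions, summability reduces to \(\int |h(t)|\,dt < \infty\). A rough numerical check contrasting a stable first-order response with an ideal integrator (the window length and step are arbitrary assumptions):

```python
import numpy as np

# Numerical check of summability over a finite window.
dt = 1e-3
t = np.arange(0.0, 100.0, dt)
h_stable = np.exp(-t)           # integral of |h| is about 1: summable
h_integrator = np.ones_like(t)  # h = 1_+(t), ideal integrator: not summable

print(np.sum(np.abs(h_stable)) * dt)      # ~1.0, finite -> BIBO stable
print(np.sum(np.abs(h_integrator)) * dt)  # ~100, grows with the window -> unstable
```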

We mention without going into details that the definition of a BIBO stable system can be extended to input signals that are so-called bounded distributions and usually denoted by \({\mathcal {B}}'\) or \({\mathcal {D}}'_{L^\infty }\) [16].

The series connection, or cascade, of two stable systems results in a stable system. This is so because the convolution of summable distributions is always well-defined and is itself a summable distribution. In addition, for linear systems the order of the connection is irrelevant since, if \(h_A\) and \(h_B\) are the impulse responses of the two systems,

$$\begin{aligned} h_A *h_B = h_B *h_A\,. \end{aligned}$$

4 Transfer Function

4.1 Stable Systems

If a system is stable then its impulse response h can be Fourier transformed and the transform \(\hat{h}\) is a continuous function of slow growth called the frequency response of the system. If the input signal x is also a summable distribution then it too can be Fourier transformed and the Fourier transform of the output signal can be represented by the product

$$\begin{aligned} \hat{y}(\omega ) = \hat{h}(\omega ) \hat{x}(\omega )\,. \end{aligned}$$
(8.4)

If the input signal x is \({\mathcal {T}}\)-periodic, then the system can be analysed in the convolution algebra of periodic distributions. To do so the impulse response h is converted into a periodic distribution by convolving it with the unit of the convolution algebra of periodic distributions, \(\delta _{\mathcal {T}}\)

$$\begin{aligned} h_{\mathcal {T}}:=h *\delta _{\mathcal {T}}\,. \end{aligned}$$

Provided that \(h_{\mathcal {T}}\) is well-defined, which for stable systems is always the case, then the output of the system can be represented by

$$\begin{aligned} y = h_{\mathcal {T}}*x\,. \end{aligned}$$

Note that while the convolution used to define \(h_{\mathcal {T}}\) is the convolution in \({\mathcal {D'}}({\mathbb {R}})\), the latter is the convolution in \({\mathcal {D'}}({\mathbb {T}})\). As discussed in Sect. 7.5, the equation is most conveniently solved with the help of the Fourier series. If we denote by \(c_m(y), c_m(h_{\mathcal {T}})\) and \(c_m(x)\) the mth Fourier coefficient of \(y, h_{\mathcal {T}}\) and x respectively, then the equation is solved if

$$\begin{aligned} c_m(y) = {\mathcal {T}}c_m(h_{\mathcal {T}}) c_m(x) \end{aligned}$$

for every \(m\in {\mathbb {Z}}\). From (4.24) we know that

$$\begin{aligned} c_m(h_{\mathcal {T}}) = \frac{\hat{h}(m\omega _c)}{{\mathcal {T}}} \end{aligned}$$

with \(\omega _c=2\pi /{\mathcal {T}}\). Therefore, by knowing the Fourier transform of the impulse response we can immediately obtain the Fourier coefficients of the output signal by

$$\begin{aligned} c_m(y) = \hat{h}(m\omega _c) c_m(x)\,. \end{aligned}$$
(8.5)

In particular, if the input is the complex tone \(\mathrm{{e}}^{\jmath \omega _ct}\), the output is also a complex tone at the exact same frequency

$$\begin{aligned} y(t) = \hat{h}(\omega _c) \mathrm{{e}}^{\jmath \omega _ct}\,. \end{aligned}$$
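This eigenfunction property is easy to verify numerically. A sketch with the assumed impulse response \(h(t) = \mathrm{e}^{-t}\textsf{1}_{+}(t)\), whose frequency response is \(\hat{h}(\omega ) = 1/(1 + \jmath \omega )\); the tone frequency w stands in for \(\omega _c\):

```python
import numpy as np

# Response of h(t) = exp(-t) 1_+(t) to the complex tone exp(j*w*t), t >= 0.
dt, w = 1e-3, 2.0
t = np.arange(0.0, 30.0, dt)
h = np.exp(-t)
x = np.exp(1j * w * t)
y = np.convolve(h, x)[: t.size] * dt  # numerical convolution h * x
h_hat = 1.0 / (1.0 + 1j * w)          # frequency response at w
# Once the start-up transient has decayed, y(t) ~ h_hat(w) * exp(j*w*t):
print(np.allclose(y[-1000:], h_hat * x[-1000:], atol=1e-2))
```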

If the input of the system is the sum of two (or more) periodic signals \(x_A\) and \(x_B\) with incommensurate frequencies \(\omega _A\) and \(\omega _B\), that is, if the ratio of the two frequencies \(\omega _A/\omega _B\) is an irrational number, then the input signal is not periodic, but almost periodic. Due to the linearity and continuity of the system, the response can still be calculated by the above technique for each input separately and the result combined

$$\begin{aligned} y(t) = \sum _{m=-\infty }^\infty \hat{h}(m\omega _A)c_m(x_A)\mathrm{{e}}^{\jmath m\omega _A t} + \hat{h}(m\omega _B)c_m(x_B)\mathrm{{e}}^{\jmath m\omega _B t}\,. \end{aligned}$$

4.2 Causal Systems

If the system is causal, that is, if its impulse response h is a distribution in \({\mathcal {D_+'}}\), and one is interested in the system response for right-sided input signals \(x\in {\mathcal {D_+'}}\), then the system response y can be calculated in the convolution algebra \({\mathcal {D_+'}}\). In particular, if h and x are Laplace transformable then the Laplace transform of the output signal can be calculated by

$$\begin{aligned} Y(s) = H(s) X(s)\,. \end{aligned}$$
(8.6)

The Laplace transform H(s) of the impulse response h is called the transfer function of the system.

If the system is BIBO stable, then the ROC of H(s) includes the imaginary axis \(s=\jmath \omega \). In this case the Fourier transform of h is immediately obtained from the transfer function by

$$\begin{aligned} \hat{h}(\omega ) = H(\jmath \omega )\,. \end{aligned}$$
(8.7)

Note that if the system is not BIBO stable then this relation is not valid even if the Fourier transform of h does exist. See Example 5.4 for a simple example where the system corresponds to an ideal integrator.

In the following we are going to denote distributions belonging to \({\mathcal {D_+'}}\cap {\mathcal {D}}_{L^1}'\) by \({\mathcal {D}}_{L^1+}'\).

5 Rational Transfer Functions

Consider a causal system described by a rational transfer function

$$\begin{aligned} H(s) = \frac{N(s)}{P(s)} = \frac{b_ns^n + b_{n-1}s^{n-1} + \cdots + b_0}{s^m + a_{m-1}s^{m-1} + \cdots + a_0}\,. \end{aligned}$$

Given the Laplace transform X(s) of the input signal x, the Laplace transformed of the output is

$$\begin{aligned} Y(s) = \frac{N(s)}{P(s)} X(s)\,. \end{aligned}$$

If we multiply both sides of this equation by P(s) we obtain

$$\begin{aligned} P(s) Y(s) = N(s) X(s) \end{aligned}$$

and by inverse Laplace transforming the equation we obtain the convolution equation

$$\begin{aligned} &\left( D^m\delta + a_{m-1}D^{m-1}\delta + \cdots + a_0\delta \right) *y \\ & \quad = \left( b_nD^n\delta + b_{n-1}D^{n-1}\delta + \cdots + b_0\delta \right) *x\,. \end{aligned}$$

With the results of Sect. 7.3 we see that this equation corresponds to the initial value problem described by the linear differential equation with constant coefficients

$$\begin{aligned} L y(t) = x_a(t) \end{aligned}$$

with

$$\begin{aligned} L &= D^m + a_{m-1}D^{m-1} + \cdots + a_0, \\ x_a(t) &= (b_nD^n + b_{n-1}D^{n-1} + \cdots + b_0) x(t) \end{aligned}$$

and zero initial conditions

$$\begin{aligned} (D^ky)(0) = 0, \quad k=0,\cdots ,m-1\,. \end{aligned}$$

For this reason \(y(t) = h(t) *x(t)\) is called the zero state response of the system.

It is obvious that the procedure can be reversed. We have therefore established a one-to-one correspondence between systems described by a rational transfer function and systems described by a linear differential equation with constant coefficients and zero initial conditions.
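This correspondence is exactly what numerical simulation tools exploit. A sketch using scipy.signal, with an assumed transfer function \(H(s) = 1/(s^2 + 3s + 2)\), equivalent to \(y'' + 3y' + 2y = x\) with zero initial conditions:

```python
import numpy as np
from scipy import signal

# H(s) = 1 / (s^2 + 3 s + 2)  <->  y'' + 3 y' + 2 y = x, zero initial conditions.
H = signal.lti([1.0], [1.0, 3.0, 2.0])
t = np.linspace(0.0, 10.0, 1000)
x = np.ones_like(t)                    # unit-step input
tout, y, _ = signal.lsim(H, U=x, T=t)  # zero state response
print(y[-1])                           # settles near H(0) = 0.5
```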

If the transfer function H of the system is minimal, that is, if its numerator and its denominator are relatively prime polynomials, then, in the complement of \(t=0\), it is possible to recreate the same output that would be produced by solving the corresponding initial value problem with non-zero initial conditions. This is achieved by driving the system with an input signal consisting of a weighted sum of a Dirac pulse and its derivatives

$$\begin{aligned} x = x_{m-1} D^{m-1}\delta + \cdots + x_0\delta \end{aligned}$$

and by suitably selecting the weighting coefficients \(x_0,\ldots ,x_{m-1}\) as described in Sect. 7.3 (see Example 7.4). Such a system is said to have order m and to be observable and controllable (see Sect. 8.6).

If H(s) is a proper rational transfer function, that is, if \(n<m\), then it can be expanded into a sum of partial fractions of the form

$$\begin{aligned} \frac{c_{jk_j}}{(s - p_j)^{k_j}}, \qquad k_j = 1,\ldots ,l_j \end{aligned}$$

with \(p_j\) the jth zero of P(s), \(l_j\) its multiplicity and \(c_{jk_j}\) constants. From Example 7.2 and the properties of the Laplace transform we therefore see that the impulse response h is the sum of products of polynomials and exponential functions. In particular, we see that the system is stable if the real parts of the poles of H(s) are negative

$$\begin{aligned} \Re \{p_j\} < 0\,. \end{aligned}$$
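In practice this stability test amounts to computing the roots of the denominator polynomial; a sketch with the same assumed \(P(s) = s^2 + 3s + 2\) as above:

```python
import numpy as np

P = [1.0, 3.0, 2.0]            # s^2 + 3 s + 2, poles at -1 and -2
poles = np.roots(P)
print(np.all(poles.real < 0))  # True -> all poles in the open left half-plane
```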

If n is not smaller than m then H(s) can be decomposed into the sum of a polynomial and a proper rational function. The impulse response h is then the sum of the above polynomial-exponential functions and a weighted sum of the Dirac impulse and its derivatives.

6 System State

In this section we review the concept of the state of a system. To this end consider the initial value problem described by the system of n differential equations

$$\begin{aligned} \frac{d}{dt} u = A u + x, \qquad u(0) = u_0 \in {\mathbb {C}}^n \end{aligned}$$

with \(A \in {\mathbb {C}}^{n\times n}\) an \(n \times n\) matrix and u and x n-dimensional vectors of complex valued functions of time. As before we can translate this initial value problem into the language of distributions by replacing the (conventional) derivative with the distributional one and work in the convolution algebra of right-sided distributions

$$\begin{aligned} Du = A u + u_0\delta + x\,. \end{aligned}$$

If we rearrange the equation and convolve each term with \(I\textsf{1}_{+}\) we obtain the equivalent equation

$$\begin{aligned} (I\delta - A\textsf{1}_{+}) *u = I\textsf{1}_{+}*(u_0\delta + x)\,. \end{aligned}$$
(8.8)

This form shows that the equation can be solved by left convolving both sides of the equation with the inverse of \((I\delta - A\textsf{1}_{+})\). In analogy with the geometric series, and provided that it converges, this inverse can be represented by the following series, in which the ordinary product of the geometric series is replaced by the convolution product

$$\begin{aligned} (I\delta - A\textsf{1}_{+})^{*-1} = I\delta + A\textsf{1}_{+}+ (A\textsf{1}_{+})^{*2} + \cdots \,. \end{aligned}$$

The iterated convolutions are easily evaluated

$$\begin{aligned} (A\textsf{1}_{+})^{*n} = A^n \textsf{1}_{+}^{*n} = A^n \frac{t^{n-1}}{(n-1)!} \textsf{1}_{+}\end{aligned}$$

and using the identity

$$\begin{aligned} \textsf{1}_{+}^{*n} = \textsf{1}_{+}^{*n} *\delta = \textsf{1}_{+}^{*n} *\textsf{1}_{+}*D\delta = \textsf{1}_{+}^{*n+1} *D\delta \end{aligned}$$

we obtain

$$\begin{aligned} (I\delta - A\textsf{1}_{+})^{*-1} = I\delta + \sum _{n=1}^\infty A^n \frac{t^{n-1}}{(n-1)!} \textsf{1}_{+}= \sum _{n=0}^\infty A^n \frac{t^n}{n!} \textsf{1}_{+}*D\delta \,. \end{aligned}$$

The last series can be expressed with the help of the exponential matrix defined by

$$\begin{aligned} \mathrm{{e}}^{At} :=\sum _{n=0}^\infty A^n \frac{t^n}{n!} \end{aligned}$$
(8.9)

which converges for every value of t

$$\begin{aligned} \begin{aligned} (I\delta - A\textsf{1}_{+})^{*-1} &= \textsf{1}_{+}\mathrm{{e}}^{At} *D\delta \,. \end{aligned} \end{aligned}$$
(8.10)

Having established the convergence of the series, using the linearity and continuity of convolution one readily sees that indeed it defines the desired inverse

$$\begin{aligned} (I\delta - A\textsf{1}_{+}) *[I\delta + A\textsf{1}_{+}+ (A\textsf{1}_{+})^{*2} + \cdots ] = I\delta \,. \end{aligned}$$

The solution of the equation is therefore given by

$$\begin{aligned} u = \textsf{1}_{+}\mathrm{{e}}^{At} *I (D\delta *\textsf{1}_{+}) *(u_0\delta + x) = \textsf{1}_{+}\mathrm{{e}}^{At} u_0 + \textsf{1}_{+}\mathrm{{e}}^{At} *x\,. \end{aligned}$$
(8.11)
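Formula (8.11) translates directly into a numerical recipe via the matrix exponential. A sketch of the zero-input part with an assumed stable \(2\times 2\) state matrix:

```python
import numpy as np
from scipy.linalg import expm

# Zero-input part of (8.11): u(t) = e^{At} u0 for t >= 0.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # assumed state matrix, eigenvalues -1 and -2
u0 = np.array([1.0, 0.0])
for t in (0.0, 0.5, 1.0):
    print(t, expm(A * t) @ u0)
```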

The exponential matrix has several useful properties that are immediately verified using its defining series

$$\begin{aligned} \mathrm{{e}}^{At} \mathrm{{e}}^{A\tau } &= \mathrm{{e}}^{A(t + \tau )} & \mathrm{{e}}^{A0} &= I\\ (\mathrm{{e}}^{At})^{-1} &= \mathrm{{e}}^{-At} & D\mathrm{{e}}^{At} &= A \mathrm{{e}}^{At} = \mathrm{{e}}^{At} A\,. \end{aligned}$$

Note however that in general

$$\begin{aligned} \mathrm{{e}}^A \mathrm{{e}}^B \ne \mathrm{{e}}^{A+B}. \end{aligned}$$

Equality is only guaranteed if A and B commute, that is, if \(AB = BA\).
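This caveat is easy to confirm numerically with a non-commuting pair of matrices and a trivially commuting one (the matrices are assumed examples):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
print(np.allclose(expm(A) @ expm(B), expm(A + B)))      # False: AB != BA
print(np.allclose(expm(A) @ expm(2 * A), expm(3 * A)))  # True: A and 2A commute
```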

Consider now the state space representation of a SISO LTI system

$$\begin{aligned} Du &= A u + u_0\delta + B x, {} & {} A\in {\mathbb {C}}^{n \times n}, B\in {\mathbb {C}}^{n \times 1} \end{aligned}$$
(8.12)
$$\begin{aligned} y &= C u + D x {} & {} C\in {\mathbb {C}}^{1 \times n}, D\in {\mathbb {C}}\end{aligned}$$
(8.13)

where now x represents the input signal of the system and y its output. The vector u is called the state of the system, and (8.11) shows that its value \(u_0\) at a given point in time \(t_0\) is the minimum amount of information that, together with the input signal at times \(t \ge t_0\), determines the system behaviour at all times \(t > t_0\). In other words, the system state \(u_0\) at time \(t_0\) summarises the effect on the system of all past values of the input signal and of previous states.

6.1 Controllability

It is interesting to ask whether it is possible to design the input signal in such a way that the system can be set to an arbitrary state \(u_0\) in finite time. That is, can we design the input signal such that for \(t>t_0\) the state vector equals \(u(t) = \mathrm{{e}}^{At} u_0\)?

The problem is most easily analysed using impulsive inputs, starting from the zero state. From the above results we know that the system state dependence on the input signal x is given by

$$\begin{aligned} u = \textsf{1}_{+}\mathrm{{e}}^{At} B *x\,. \end{aligned}$$

Suppose that for an n dimensional system we use an input signal consisting of a weighted sum of a Dirac impulse and its derivatives up to order \(n-1\)

$$\begin{aligned} x = x_0\delta + x_1D\delta + \cdots + x_{n-1}D^{n-1}\delta \,. \end{aligned}$$

Since the system is linear, we can analyse the contribution of each term individually

$$\begin{aligned} \textsf{1}_{+}\mathrm{{e}}^{At} B *x_0\delta &= \textsf{1}_{+}\mathrm{{e}}^{At} B x_0\\ \textsf{1}_{+}\mathrm{{e}}^{At} B *x_1D\delta &= D(\textsf{1}_{+}\mathrm{{e}}^{At} B x_1) = \textsf{1}_{+}\mathrm{{e}}^{At} A B x_1 + \delta B x_1\\ & \cdots \\ \textsf{1}_{+}\mathrm{{e}}^{At} B *x_{n-1}D^{n-1}\delta &= D^{n-1}(\textsf{1}_{+}\mathrm{{e}}^{At} B x_{n-1}) = \textsf{1}_{+}\mathrm{{e}}^{At} A^{n-1} B x_{n-1} + \cdots \end{aligned}$$

The terms replaced by dots on the last line constitute a weighted sum of a Dirac impulse and its derivatives, which are zero for \(t>0\). Putting all terms together we obtain for \(t>0\)

$$\begin{aligned} \textsf{1}_{+}\mathrm{{e}}^{At} B *x = \textsf{1}_{+}\mathrm{{e}}^{At} \begin{bmatrix} B & AB & \dots & A^{n-1}B \end{bmatrix} \cdot \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix}\,. \end{aligned}$$

From this we conclude that we can use a suitably designed input signal x to mimic the effect of an arbitrary initial state \(u_0\) if and only if the matrix

$$\begin{aligned} \mathcal {C} :=\begin{bmatrix} B & AB & \dots & A^{n-1}B \end{bmatrix} \end{aligned}$$
(8.14)

is invertible, in which case the weighting factors are

$$\begin{aligned} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix} = \mathcal {C}^{-1} u_0\,. \end{aligned}$$

The matrix \(\mathcal {C}\) is called the controllability matrix.
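A small helper makes the test concrete; the matrices below are assumed values for illustration, and the helper name is our own:

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack the columns B, AB, ..., A^(n-1) B."""
    cols = [B]
    for _ in range(A.shape[0] - 1):
        cols.append(A @ cols[-1])
    return np.hstack(cols)

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C_mat = controllability_matrix(A, B)
print(np.linalg.matrix_rank(C_mat) == A.shape[0])  # full rank -> controllable

# Impulse weights that mimic an initial state u0, as in the formula above:
u0 = np.array([1.0, 1.0])
print(np.linalg.solve(C_mat, u0))
```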

While the state of a system plays an important theoretical and conceptual role, in practice, when dealing with controllable systems we can always start from the zero state and drive the system into any desired state. Things are completely different for non-controllable systems. As discussed in Sect. 8.6.3, these are systems possessing sub-systems that are not influenced by the input signal. In such systems the initial state may play an important role.

6.2 Observability

Another interesting question is whether it is possible to reconstruct the initial state of a system at time \(t_0\) from the observation of its output at times \(t > t_0\), assuming that A, B, C, D and the input signal x are known. From linearity and knowledge of the input signal we can assume x to be zero. (Alternatively we could compute the part of the output signal due to the input signal, the zero state response of the system, and subtract it from the observed output.) The question is then whether we can calculate \(u_0\) from the observation of

$$\begin{aligned} y = C \textsf{1}_{+}\mathrm{{e}}^{At} u_0. \end{aligned}$$

Suppose that the system is n dimensional. Then if we compute the first \(n-1\) derivatives of the output signal we obtain

$$\begin{aligned} Dy &= C \textsf{1}_{+}\mathrm{{e}}^{At} A u_0 + C \delta u_0\\ & \cdots \\ D^{n-1}y &= C \textsf{1}_{+}\mathrm{{e}}^{At} A^{n-1} u_0 + \cdots \end{aligned}$$

where in the last equation we have represented by dots a weighted sum of a Dirac pulse and its derivatives as before. Thus, the observation of the output signal and of its first \(n-1\) derivatives at times \(t>0\) allows setting up the following system of equations

$$\begin{aligned} \lim _{t\rightarrow 0+} \begin{bmatrix} y(t)\\ Dy(t)\\ \vdots \\ D^{n-1}y(t) \end{bmatrix} &= \begin{bmatrix} C\\ C A\\ \vdots \\ C A^{n-1} \end{bmatrix} \cdot u_0\,. \end{aligned}$$

This system of equations can only be solved for \(u_0\) if the matrix

$$\begin{aligned} \mathcal {O} :=\begin{bmatrix} C\\ C A\\ \vdots \\ C A^{n-1} \end{bmatrix} \end{aligned}$$
(8.15)

is not singular. The matrix \(\mathcal {O}\) is called the observability matrix.
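The dual numerical test, again with assumed matrices and our own helper name:

```python
import numpy as np

def observability_matrix(A, C):
    """Stack the rows C, CA, ..., C A^(n-1)."""
    rows = [C]
    for _ in range(A.shape[0] - 1):
        rows.append(rows[-1] @ A)
    return np.vstack(rows)

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])
O = observability_matrix(A, C)
print(np.linalg.matrix_rank(O) == A.shape[0])  # full rank -> u0 recoverable
```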

6.3 Jordan Normal Form

The simplest way to understand the structure of a system that is either not controllable or not observable is by considering the system in Jordan normal form.

Consider a system in the state space representation

$$\begin{aligned} Du &= A u + B x, {} & {} \qquad \qquad A\in {\mathbb {C}}^{n \times n}, B\in {\mathbb {C}}^{n \times 1}\\ y &= C u {} & {} C\in {\mathbb {C}}^{1 \times n}\,. \end{aligned}$$

In linear algebra it is shown that, by choosing a suitable basis, every linear operator can be represented by a matrix of the following block form, called the Jordan normal form

$$\begin{aligned} A = \begin{bmatrix} J_1 &{} &{} &{} 0\\ &{} J_2 &{} &{} \\ &{} &{} \ddots &{} \\ 0 &{} &{} &{} J_r \end{bmatrix} \end{aligned}$$

with

$$\begin{aligned} J_i = \begin{bmatrix} \lambda _i &{} 1 &{} &{} 0 \\ &{} \lambda _i &{} 1 &{} \\ &{} &{} \ddots &{} \ddots \\ 0 &{} &{} &{} \lambda _i \end{bmatrix} \end{aligned}$$

the elementary Jordan matrix. The diagonal elements of \(J_i\) all correspond to the ith eigenvalue \(\lambda _i\) of A. If \(n_i\) denotes the algebraic multiplicity of the eigenvalue \(\lambda _i\) and \(\nu _i\) its geometric multiplicity, then there are \(\nu _i\) Jordan blocks \(J_i\) corresponding to the eigenvalue \(\lambda _i\). Thus, the total number of Jordan blocks corresponds to the number of independent eigenvectors of A. The Jordan normal form of a linear operator is unique up to permutations of the blocks.

A matrix for which the geometric multiplicity equals the algebraic multiplicity for each eigenvalue is called semisimple. In this case each block \(J_i\) is a \(1\times 1\) matrix and the Jordan normal form reduces to diagonal form.
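Computer algebra systems can compute the Jordan normal form exactly. A sketch with SymPy, for an assumed matrix with one eigenvalue of algebraic multiplicity 2 but geometric multiplicity 1:

```python
import sympy as sp

A = sp.Matrix([[3, 1],
               [-1, 1]])      # eigenvalue 2, algebraic mult. 2, geometric mult. 1
P, J = A.jordan_form()        # A = P * J * P**-1
sp.pprint(J)                  # single 2x2 Jordan block [[2, 1], [0, 2]]
```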

Fig. 8.2 Jordan normal form representation of a system

A system in Jordan normal form can be interpreted as the parallel connection of independent sub-systems, each represented by a Jordan block \(J_i\). Figure 8.2 shows the block diagram for a system with a simple eigenvalue \(\lambda _0\) and a double eigenvalue \(\lambda _1\) with \(\nu _1 = 1\). From the figure it is easy to see that if \(b_0 = 0\) then the state variable \(u_0\) cannot be excited by the input signal x. The same is true for \(u_2\) if \(b_2 = 0\). In either case the system is not controllable. One can check that these are the two conditions under which the determinant of the matrix \(\mathcal {C}\) vanishes.

In a similar way the figure shows that if \(c_0 = 0\) there is no path from \(u_0\) to the output of the system and for \(c_1 = 0\) there is no path from \(u_1\). These are the two cases under which the system is not observable and correspond to the two conditions under which the determinant of the matrix \(\mathcal {O}\) vanishes.

Fig. 8.3 a Non-controllable system. b Non-observable system

From these considerations we conclude that a non-observable system includes a sub-system whose output does not reach the global system output as schematically depicted in Fig. 8.3b. A non-controllable system includes a sub-system that is not reached by the input signal as schematically depicted in Fig. 8.3a.

Example 8.2: Jordan Block

Consider the system described by the following state-space representation

$$\begin{aligned} Du &= A u + B x\\ y &= C u \end{aligned}$$

with

$$\begin{aligned} A &= \begin{bmatrix} \omega _{3dB} &{} 1\\ 0 &{} \omega _{3dB} \end{bmatrix}, & B &= \begin{bmatrix} b_0 \\ b_1 \end{bmatrix}\,, & C &= \begin{bmatrix} c_0 & c_1 \end{bmatrix}\,. \end{aligned}$$

We want to compute an explicit expression for the exponential matrix \(\mathrm{{e}}^{t A}\) allowing us to compute the response of the system to an arbitrary input signal x.

The matrix

$$\begin{aligned} A = \begin{bmatrix} \omega _{3dB} &{} 1\\ 0 &{} \omega _{3dB} \end{bmatrix} \end{aligned}$$

is an elementary Jordan matrix and cannot be transformed into a diagonal matrix by a similarity transformation. In fact, as can be seen from the characteristic polynomial

$$\begin{aligned} \det (A - \lambda I) = (\omega _{3dB} - \lambda )^2\,, \end{aligned}$$

the matrix has a single eigenvalue \(\lambda = \omega _{3dB}\) with an algebraic multiplicity of 2 and the eigenspace belonging to this eigenvalue has dimension 1

$$\begin{aligned} \bigl (A - \omega _{3dB} I\bigr ) v = \begin{bmatrix} 0 &{} 1\\ 0 &{} 0 \end{bmatrix} v = 0 \qquad \implies \qquad v = \alpha \begin{bmatrix} 1 \\ 0 \end{bmatrix}\,, \quad \alpha \in {\mathbb {C}}\,. \end{aligned}$$

The matrix A can however be written as the sum of a diagonal matrix \(A_d\) and a particularly simple matrix \(A_c\)

$$\begin{aligned} A = A_d + A_c = \begin{bmatrix} \omega _{3dB} &{} 0\\ 0 &{} \omega _{3dB} \end{bmatrix} + \begin{bmatrix} 0 &{} 1\\ 0 &{} 0 \end{bmatrix}\,. \end{aligned}$$

Observe that the matrices \(A_d\) and \(A_c\) do commute. For this reason we can use the following property of the exponential matrix

$$\begin{aligned} \mathrm{{e}}^{t (A_d + A_c)} = \mathrm{{e}}^{t A_d} \mathrm{{e}}^{t A_c}\,. \end{aligned}$$

Since \(A_d\) is diagonal, the first exponential matrix \(\mathrm{{e}}^{t A_d}\) is easily calculated to be

$$\begin{aligned} \mathrm{{e}}^{t A_d} = \mathrm{{e}}^{\omega _{3dB} t} I\,. \end{aligned}$$

The second exponential matrix \(\mathrm{{e}}^{t A_c}\) is easily calculated from the series defining the exponential matrix by noting that the square of the matrix \(A_c\) vanishes

$$\begin{aligned} \mathrm{{e}}^{t A_c} = I + t A_c\,. \end{aligned}$$

Putting these results together we obtain

$$\begin{aligned} \mathrm{{e}}^{t A} = \mathrm{{e}}^{\omega _{3dB} t} \left( \begin{bmatrix} 1 &{} 0\\ 0 &{} 1 \end{bmatrix} + \begin{bmatrix} 0 &{} t\\ 0 &{} 0 \end{bmatrix} \right) = \mathrm{{e}}^{\omega _{3dB} t} \begin{bmatrix} 1 &{} t\\ 0 &{} 1 \end{bmatrix}\,. \end{aligned}$$

The above method can be used to calculate the exponential of any elementary Jordan matrix with the only modification that for an \(n\times n\) matrix A it is the nth power of the matrix \(A_c\) that vanishes.
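The closed form obtained in the example can be cross-checked against a numerical matrix exponential; the numeric value standing in for \(\omega _{3dB}\) is an arbitrary assumption:

```python
import numpy as np
from scipy.linalg import expm

w = 1.5                                  # stands in for omega_3dB
A = np.array([[w, 1.0], [0.0, w]])
for t in (0.5, 2.0):
    closed_form = np.exp(w * t) * np.array([[1.0, t], [0.0, 1.0]])
    print(np.allclose(expm(A * t), closed_form))   # True
```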

In the following we are always going to assume that the systems under consideration are controllable and observable.