1 Introduction

As outlined in Chap. 1, the behavior of nonlinear systems is substantially richer than that of linear systems. To deal with them there is a set of techniques, each best suited to analyse particular aspects or particular classes of nonlinear systems. We target systems that are stable about an equilibrium point and that depend continuously on the input signal.

Before analysing this class of systems in more detail, we give a short overview, mostly by way of examples, of systems described by nonlinear ordinary differential equations of the form

$$\begin{aligned} Dy = f(t,y), \qquad f: I\times X \rightarrow {\mathbb {R}}^n \end{aligned}$$
(9.1a)

with \(I\subset {\mathbb {R}}\), \(X\subset {\mathbb {R}}^n\) and initial conditions

$$\begin{aligned} y(0) = y_0 \in X\,. \end{aligned}$$
(9.1b)

We limit ourselves to the aspects that are helpful in better framing the concept of weakly nonlinear systems.

A first important difference compared to systems described by linear differential equations with constant coefficients is the fact that a solution may not exist for all \(t > 0\) or may not be unique.

Example 9.1: IVP with many Solutions

Consider the following initial value problem (IVP)

$$\begin{aligned} Dy = \sqrt{|y |} \qquad y(0) = y_0\,. \end{aligned}$$

If \(y_0>0\) then the equation can be solved by the method of separation of variables, and we obtain the unique solution

$$\begin{aligned} y(t) = \frac{1}{4}(t + 2\sqrt{y_0})^2\,, \qquad t \ge 0\,. \end{aligned}$$

If \(y_0=0\) then \(y(t) = 0\) is a solution. However, it is not the only one. For any constant \(c>0\) the function

$$\begin{aligned} y_c(t) = \frac{\textsf{1}_{+}(t - c)}{4} (t - c)^2\,, \qquad t \ge 0 \end{aligned}$$

is also a solution as one easily verifies by inserting it in the equation.
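Such a verification can also be delegated to a CAS. A minimal SymPy sketch, checking the branch \(t \ge c\) through the substitution \(u = t - c \ge 0\) (on \(t < c\) the candidate solution vanishes identically and satisfies the equation trivially):

```python
import sympy as sp

# On t >= c the candidate solution is y = (t - c)^2/4; with u = t - c >= 0
# the equation Dy = sqrt(|y|) reduces to an identity in u.
u = sp.symbols('u', nonnegative=True)
y = u**2/4
print(sp.simplify(sp.diff(y, u) - sp.sqrt(sp.Abs(y))))  # -> 0
```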

For \(y_0<0\) we can again use the method of separation of variables to find the solution

$$\begin{aligned} y(t) = -\frac{1}{4}(2\sqrt{|y_0 |} - t)^2\,. \end{aligned}$$

However, due to the fact that at \(y=0\) the function \(1/\sqrt{|y |}\) is not continuous (not even defined), this solution is only valid as long as \(y(t)<0\). When y(t) reaches zero the equation can again be satisfied by multiple solutions: for any \(c \ge 2\sqrt{|y_0 |}\),

$$\begin{aligned} y_c(t) = {\left\{ \begin{array}{ll} -\frac{1}{4}(2\sqrt{|y_0 |} - t)^2&{} t \in [0,2\sqrt{|y_0 |})\\ 0&{} t \in [2\sqrt{|y_0 |},c)\\ \frac{1}{4}(t - c)^2&{} t \in [c,\infty )\,. \end{array}\right. } \end{aligned}$$

Therefore, for some initial conditions the equation has uncountably many solutions (Fig. 9.1).

Fig. 9.1  Two solutions of the initial value problem of Example 9.1 with \(y_0=-0.5\)

From the above example we see that continuity of f is not enough to guarantee the existence of a unique solution of the initial value problem (9.1a). To guarantee uniqueness of a solution, the function f(t, y) must be more regular with respect to y.

Let \(I\subset {\mathbb {R}}\) and \(X\subset {\mathbb {R}}^n\). A function \(f\in C(I\times X,{\mathbb {R}}^n)\) is called locally Lipschitz continuous in y if every point \((t_0,y_0) \in I\times X\) has a neighborhood \(U\times V\) such that, for some constant \(M>0\)

$$\begin{aligned} \Vert f(t,y) - f(t,x)\Vert \le M \Vert y - x\Vert \,, \qquad t \in U,\quad x,y \in V\,. \end{aligned}$$

If the function f(t, y) in (9.1a) is continuous in t and locally Lipschitz continuous in y, then the Picard–Lindelöf theorem guarantees the existence and uniqueness of the solution of the initial value problem (9.1a), (9.1b) [23].

If the function f doesn’t depend explicitly on time, then the system is time invariant and the system equation becomes

$$\begin{aligned} Dy = f(y)\,, \qquad f: X \rightarrow {\mathbb {R}}^n,\quad X\subset {\mathbb {R}}^n\,. \end{aligned}$$
(9.2)

A constant solution of the equation, that is one for which \(Dy = 0\), is called an equilibrium point of the system. When one investigates the stability of an equilibrium point \(y_e\) one can always assume it to be at the origin. In fact, by the change of variable \(u = y - y_e\) one can always transform the system differential equation into one whose equilibrium point of interest is \(u_e=0\)

$$\begin{aligned} Du = D(u + y_e) = f(u + y_e) =:g(u). \end{aligned}$$

An equilibrium point is stable if for each \(c > 0\) one can find an \(\epsilon > 0\) such that

$$\begin{aligned} \Vert y(t_0)\Vert < \epsilon \quad \implies \quad \Vert y(t)\Vert < c,\quad t \ge t_0\,. \end{aligned}$$

It is asymptotically stable if it is stable and in addition \(\epsilon \) can be chosen such that

$$\begin{aligned} \Vert y(t_0)\Vert < \epsilon \quad \implies \quad \lim _{t\rightarrow \infty } \Vert y(t)\Vert = 0\,. \end{aligned}$$

The set of all points \(y(t_0)\) such that \(\Vert y(t)\Vert \) converges to zero as t tends to infinity is called the domain of attraction of the equilibrium point. If an equilibrium point is not stable it is called unstable.

As already highlighted in Chap. 1, an important difference between time-invariant nonlinear systems and LTI ones is the possible existence of multiple isolated equilibrium points.

Example 9.2

Consider the system described by the following differential equation

$$\begin{aligned} Dy = -a y + c y^2 \end{aligned}$$

with a and c positive constants. From

$$\begin{aligned} 0 = -a y + c y^2 = c y (y - a/c) \end{aligned}$$

we see that the system has two equilibrium points:

$$\begin{aligned} y(t) = 0 \qquad \text {and}\qquad y(t) = a/c\,. \end{aligned}$$

We are interested in the dynamics of the system starting from the initial condition \(y(0) = y_0\), assuming that \(y_0\) doesn't coincide with an equilibrium point. Since the function \(f(y) = -a y + c y^2\) is locally Lipschitz continuous, there is a unique solution and this solution doesn't intersect the equilibrium points. The initial value problem can therefore be solved by separating the variables and integrating

$$\begin{aligned} \int \limits _{y_0}^y \frac{dy}{c y(y - a/c)} = \int \limits _0^t dt\,. \end{aligned}$$

The solution is found to be

$$\begin{aligned} y(t) = y_0 \frac{\textrm{e}^{-at}}{1 - y_0\frac{c}{a}(1 - \textrm{e}^{-at})}\,. \end{aligned}$$

If \(y_0\) is negative or \(0 < y_0c/a < 1\), the solution converges toward zero, which is therefore an asymptotically stable equilibrium point (see Fig. 9.2). If \(y_0c/a > 1\) the solution diverges and reaches infinity in the finite time

$$\begin{aligned} t_\infty = \frac{1}{a}\ln \left( \frac{1}{1-\frac{a}{y_0c}}\right) \,. \end{aligned}$$
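The closed-form solution and the escape time can be verified with a CAS; a minimal SymPy sketch:

```python
import sympy as sp

# Verify that y solves Dy = -a y + c y^2 with y(0) = y_0, and that the
# denominator of y vanishes at the escape time t_inf (case y_0 c/a > 1).
t, a, c, y0 = sp.symbols('t a c y_0', positive=True)
y = y0*sp.exp(-a*t)/(1 - y0*(c/a)*(1 - sp.exp(-a*t)))
print(sp.simplify(sp.diff(y, t) - (-a*y + c*y**2)))  # -> 0
print(y.subs(t, 0))                                  # -> y_0
t_inf = sp.log(1/(1 - a/(y0*c)))/a
print(sp.simplify((1 - y0*(c/a)*(1 - sp.exp(-a*t))).subs(t, t_inf)))  # -> 0
```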
Fig. 9.2  Solutions for various initial conditions of the initial value problem of Example 9.2 with \(a=1\) and \(c=1/2\)

From the above example we see that a nonlinear system can have multiple equilibrium points, some of which can be stable and some unstable. For a system to remain stable around a stable equilibrium point, the initial condition may have to remain within a limited region around that point. Also, solutions starting near unstable equilibrium points can diverge faster than exponentially and reach infinity in finite time (finite escape time).

One of the most useful tools in the study of the stability of equilibrium points is Lyapunov stability theory [24]. In particular, Lyapunov's linearization (or indirect) method states that

  • If the linear approximation of the system about an equilibrium point is asymptotically stable then, in a neighborhood U of the equilibrium point, the (nonlinear) system is asymptotically stable. The largest neighborhood U is the domain of attraction of the equilibrium point.

  • If the linear approximation of the system about an equilibrium point is unstable, then the (nonlinear) system is unstable.

If the linear approximation of the system is neither asymptotically stable nor unstable then this method is inconclusive and one must turn to other methods, for example, Lyapunov’s direct method [24].

Example 9.3

Consider the initial value problem described by the differential equation

$$\begin{aligned} Dy = c y^3 \end{aligned}$$

with c a constant; and the initial condition

$$\begin{aligned} y(0) = y_0\,. \end{aligned}$$

The only equilibrium point of the equation is the zero solution \(y_e(t) = 0\). As is immediately seen, the linearized equation is stable, but not asymptotically stable, about the equilibrium point.

The nonlinear equation can be solved by the method of separation of variables

$$\begin{aligned} \int \limits _{y_0}^y \frac{dy}{y^3} = c \int \limits _0^t dt\,. \end{aligned}$$

Performing the integrations and solving for y we find

$$\begin{aligned} y(t) = \frac{y_0}{\sqrt{1 - 2 c y_0^2 t}}\,. \end{aligned}$$

If \(c > 0\) the solution diverges and reaches infinity at

$$\begin{aligned} t_\infty = \frac{1}{2 c y_0^2}\,. \end{aligned}$$

If \(c < 0\) the equilibrium point is asymptotically stable for any value of the initial condition \(y_0\) (see Fig. 9.3).
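Again, the solution and the finite escape time are easily checked with a CAS; a minimal SymPy sketch for the case \(c > 0\):

```python
import sympy as sp

# Verify that y solves Dy = c y^3 and blows up at t_inf = 1/(2 c y_0^2).
t, c, y0 = sp.symbols('t c y_0', positive=True)
y = y0/sp.sqrt(1 - 2*c*y0**2*t)
print(sp.simplify(sp.diff(y, t) - c*y**3))  # -> 0
print(sp.limit(y, t, 1/(2*c*y0**2), '-'))   # -> oo (finite escape time)
```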

Contrary to what the above examples may suggest, most nonlinear differential equations can't be solved analytically. Therefore, we are interested in methods to find approximate solutions around asymptotically stable equilibrium points, in the spirit of a perturbation theory. Weakly nonlinear systems are a class of systems for which such a method exists and for which the solution is obtained in the form of a functional series.

Fig. 9.3  Solutions for various initial conditions of the initial value problem of Example 9.3 with \(c=-1\)

Informally weakly nonlinear systems can be described as systems operated around an asymptotically stable equilibrium point and whose response depends continuously on the input signal x. They include systems described by a differential equation of the form

$$\begin{aligned} \begin{array}{cc} Dy = C x + f(y)\,, \\[5pt] f: Y \rightarrow {\mathbb {R}}^n, \quad C: X \rightarrow {\mathbb {R}}^n\,, \quad X\subset {\mathbb {R}}\,, \quad Y\subset {\mathbb {R}}^n \end{array} \end{aligned}$$

with C a linear function and f a function that, within the excursion range of interest of y, can be approximated to any desired accuracy by a Taylor expansion. Note that polynomials are locally Lipschitz continuous. For this reason weakly nonlinear systems are well-behaved and produce a well-defined and unique output response.

2 Graded Algebra of Test Functions

In the previous section we illustrated some aspects of weakly nonlinear systems based on examples of systems described by nonlinear differential equations. We now look for a description based on distributions. We’ll see that this allows reducing the problem of solving some classes of nonlinear differential equations to an essentially algebraic problem. However, before discussing systems, we need some preparation that we provide in this and the next section.

Let \(V_k, k \in {\mathbb {N}}\), be vector spaces over \({\mathbb {C}}\) such that \(V_k \cap V_j = \{0\}\) for \(k \ne j\). The direct sum

$$\begin{aligned} V :=\bigoplus _{k=0}^\infty V_k :=\bigoplus _{k \ge 0} V_k \end{aligned}$$
(9.3)

is the vector space whose elements are the sequences \((x_k)\) in \(\bigcup _{k=0}^\infty V_k\) with \(x_k \in V_k\) and \(x_k = 0\) for all but finitely many k, that is, the set of all finitely supported sequences with \(x_k \in V_k\). The vector space structure of V is defined by the following addition and multiplication with scalars

$$\begin{aligned} (x_k) + c (y_k) :=(x_k + c y_k), \qquad (x_k),(y_k) \in V\,, \quad c \in {\mathbb {C}}\,. \end{aligned}$$
(9.4)

Each \(V_k\) is evidently a sub-vector space of V.

If furthermore V is provided with a multiplication

$$\begin{aligned} V \times V \rightarrow V, \qquad (x,y) \mapsto x \odot y \end{aligned}$$

such that it forms an algebra and in addition

$$\begin{aligned} V_k \odot V_j \subset V_{k+j}\,, \qquad k,j \in {\mathbb {N}}\end{aligned}$$

then it is called a graded algebra.

Let \(V_k = {\mathcal {D}}({\mathbb {R}}^k)\) be the vector space of test functions on \({\mathbb {R}}^k\) with \(V_0 = {\mathbb {C}}\). Then 

$$\begin{aligned} {\mathcal {D}}_{\oplus }:=\bigoplus _{k\ge 0} {\mathcal {D}}({\mathbb {R}}^k) \end{aligned}$$

with the tensor product as multiplication

$$\begin{aligned} \phi \otimes \psi (\tau _1,\ldots ,\tau _k,\tau _{k+1},\ldots ,\tau _{k+j}) :=\phi (\tau _1,\ldots ,\tau _k) \psi (\tau _{k+1},\ldots ,\tau _{k+j}) \end{aligned}$$

is a graded algebra that we call the graded algebra of test functions. We write elements of \({\mathcal {D}}_{\oplus }\) as sums with indices denoting the grade of the element

$$\begin{aligned} \phi = \sum _{j=0}^N \phi _j\,, \qquad \phi _j \in {\mathcal {D}}({\mathbb {R}}^j)\,, \quad N \in {\mathbb {N}}\,. \end{aligned}$$

In the graded algebra of test functions we define the following convergence criterion. A sequence \((\phi _m), \phi _m\in {\mathcal {D}}_{\oplus }\) with

$$\begin{aligned} \phi _m = \sum _{j=0}^{N_m} \phi _{j,m}\,, \qquad \phi _{j,m} \in {\mathcal {D}}({\mathbb {R}}^j) \end{aligned}$$

converges to zero if

  1. There exist compact sets \(K_j\subset {\mathbb {R}}^j\), \(j=1,\ldots ,N\) with \(N = \max _{m \in {\mathbb {N}}}(N_m)\) such that for each j and m

    $$\begin{aligned} \text {supp}(\phi _{j,m}) \subset K_j\,. \end{aligned}$$
  2. For every \(j>0\) and every j-tuple \(k \in {\mathbb {N}}^j\) the sequence \((D^k\phi _{j,m})_{m\in {\mathbb {N}}}\) converges uniformly to zero. For \(j=0\) the sequence of numbers \((\phi _{0,m})_{m\in {\mathbb {N}}}\) converges to zero.

3 Direct Product of Distributions

The direct product V of vector spaces \(V_k\) over \({\mathbb {C}}\) is the vector space whose elements are the sequences \((x_k)\) with \(x_k\in V_k, k\in {\mathbb {N}}\). The vector space structure is defined as for the direct sum by (9.4). It is denoted by

$$\begin{aligned} V :=\prod _{k \ge 0} V_k :=\prod _{k=0}^\infty V_k\,. \end{aligned}$$
(9.5)

The key difference from the direct sum is that, in a direct product, the sequences are not required to have only finitely many nonzero components.

Let \(V_k={\mathcal {D}}'({\mathbb {R}}^k)\), with \(V_0={\mathbb {C}}\). Then the direct product 

$$\begin{aligned} {\mathcal {D}}_{\oplus }' :=\prod _{k\ge 0} {\mathcal {D}}'({\mathbb {R}}^k) \end{aligned}$$

is the set of linear continuous functionals on \({\mathcal {D}}_{\oplus }\) defined by

$$\begin{aligned} h: {\mathcal {D}}_{\oplus }\rightarrow {\mathbb {C}}, \qquad \phi \mapsto \langle h,\phi \rangle :=\sum _{j=0}^\infty \langle h_j,\phi _j \rangle \end{aligned}$$
(9.6)

with

$$\begin{aligned} \phi = \sum _{j=0}^\infty \phi _j\,, \qquad h = \sum _{j=0}^\infty h_j\,, \qquad \phi _j\in {\mathcal {D}}({\mathbb {R}}^j), \quad h_j\in {\mathcal {D}}'({\mathbb {R}}^j)\,. \end{aligned}$$

Since \(\phi \) only has a finite number of terms different from zero, \(\langle h,\phi \rangle \) is well-defined. Since, for \(k \ne j\), \({\mathcal {D}}'({\mathbb {R}}^k) \cap {\mathcal {D}}'({\mathbb {R}}^j) = \{0\}\), here and in the following we denote elements of \({\mathcal {D}}_{\oplus }'\) by sums in a similar way as we do for elements of \({\mathcal {D}}_{\oplus }\).

Continuity in \({\mathcal {D}}'_\oplus \) is defined with respect to the convergence that we defined for \({\mathcal {D}}_\oplus \) and follows from the continuity of distributions. Since the functionals (9.6) are linear, it's enough to verify continuity at the origin. Let \(h \in {\mathcal {D}}'_\oplus \) and \(\phi \in {\mathcal {D}}_{\oplus }\); then there exists an \(N \in {\mathbb {N}}\) such that

$$\begin{aligned} |\langle h,\phi \rangle | \le \sum _{j=0}^N |\langle h_j,\phi _j \rangle | \le (N+1) \sup _{j \in \{0,\ldots ,N\}} |\langle h_j,\phi _j \rangle | \end{aligned}$$

and according to our definition of convergence, when \(\phi \) converges to zero, so does \(\sup _j|\langle h_j,\phi _j \rangle |\) and hence \(\langle h,\phi \rangle \).

In Sect. 3.1 we have introduced the tensor product of distributions and have seen that it is well defined between any pair of distributions. With it we can define a product \(g \cdot h\) between elements g and h of \({\mathcal {D}}'_\oplus \). Its kth component is defined by

$$\begin{aligned} (g h)_k :=(g \cdot h)_k :=\sum _{j=0}^k g_j \otimes h_{k - j}\,, \qquad k \in {\mathbb {N}}\end{aligned}$$
(9.7)

with \(g_j\) and \(h_j\) the jth components of g and h respectively. With this product \(({\mathcal {D}}'_\oplus , +, \cdot )\) becomes an algebra. As is common practice, we will often denote \(g \cdot h\) simply by gh. Being based on an associative operation (the tensor product) the product that we just defined is associative.

Note the close similarity between the algebra of formal power series and the one that we have defined for \({\mathcal {D}}'_\oplus \). In both cases addition is defined componentwise and the product has the form of a convolution.
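The analogy can be made concrete with a small numerical sketch. Here, purely as an illustrative assumption, a component of grade k is represented by a k-dimensional array of samples, the tensor product by the outer product, and the product (9.7) combines grades like the Cauchy product of power series:

```python
import numpy as np

def outer(u, v):
    # tensor product of sampled components; grade-0 components are scalars
    return np.multiply.outer(np.atleast_1d(u), np.atleast_1d(v)).squeeze()

def product(g, h, kmax):
    # kth component of g*h as in (9.7): sum of g_j (x) h_{k-j}
    return {k: sum(outer(g[j], h[k - j])
                   for j in range(k + 1) if j in g and (k - j) in h)
            for k in range(kmax + 1)}

g = {0: 2.0, 1: np.array([1.0, 0.0, -1.0])}
h = {1: np.array([0.5, 0.5, 0.0]), 2: np.ones((3, 3))}
print({k: np.shape(v) for k, v in product(g, h, 3).items()})
# grade k of the product is a k-dimensional array: grades add as k + j
```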

4 Symmetric Distributions

Let \(\mathsf {S_{k}}\) denote the set of all permutations of \(\{1,\ldots ,k\}\). A distribution \(h_k\in {\mathcal {D}}'({\mathbb {R}}^k)\) is symmetric if

$$\begin{aligned} \langle h_k,\phi (\tau _{\sigma (1)},\ldots ,\tau _{\sigma (k)}) \rangle = \langle h_k,\phi (\tau _1,\ldots ,\tau _k) \rangle \end{aligned}$$
(9.8)

for all permutations \(\sigma \in \mathsf {S_{k}}\) and every \(\phi \in {\mathcal {D}}({\mathbb {R}}^k)\). Symmetric distributions are fully characterized by symmetric test functions for

$$\begin{aligned} \langle h_k,\phi (\tau _1,\ldots ,\tau _k) \rangle = \left\langle {h_k},{\frac{1}{k!} \sum _{\sigma \in \mathsf {S_{k}}} \phi (\tau _{\sigma (1)},\ldots ,\tau _{\sigma (k)})}\right\rangle \end{aligned}$$

and the sum of test functions on the right-hand side is a symmetric test function. The sum of symmetric distributions is a symmetric distribution. Therefore, they form a vector subspace of distributions that we denote by \({\mathcal {D}}'_\text {sym}({\mathbb {R}}^k)\). Similarly, we denote the vector subspace of all symmetric test functions on \({\mathbb {R}}^k\) by \({\mathcal {D}}_\text {sym}({\mathbb {R}}^k)\), the direct sum of symmetric test functions by \({\mathcal {D}}_{\oplus ,\text {sym}}\) and the direct product of symmetric distributions by \({\mathcal {D}}'_{\oplus ,\text {sym}}\).

A symmetric distribution can be constructed from an arbitrary distribution \(f \in {\mathcal {D}}'({\mathbb {R}}^k)\) by averaging over all permutations of the independent variables 

$$\begin{aligned} \left[ f\right] _{\text {sym}} :=\frac{1}{k!} \sum _{\sigma \in \mathsf {S_{k}}} f(\tau _{\sigma (1)},\ldots ,\tau _{\sigma (k)}) \end{aligned}$$
(9.9)

with

$$\begin{aligned} \langle f(\tau _{\sigma (1)},\ldots ,\tau _{\sigma (k)}),\phi (\tau _1,\ldots ,\tau _k) \rangle :=\langle f(\tau _1,\ldots ,\tau _k),\phi (\tau _{\sigma (1)},\ldots ,\tau _{\sigma (k)}) \rangle \,. \end{aligned}$$

Such an operation is called symmetrisation.
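In a sampled representation, symmetrisation is simply an average over all permutations of the array axes; a minimal NumPy sketch:

```python
import numpy as np
from itertools import permutations
from math import factorial

def symmetrise(f):
    # average of f over all permutations of its k axes, as in (9.9)
    k = f.ndim
    return sum(np.transpose(f, p) for p in permutations(range(k)))/factorial(k)

f = np.arange(8.0).reshape(2, 2, 2)                  # an arbitrary sampled kernel
fs = symmetrise(f)
print(np.allclose(fs, np.transpose(fs, (1, 0, 2))))  # True: fs is symmetric
```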

The tensor product is a bi-linear operation. Therefore, the kth power of an element of \({\mathcal {D}}'_\oplus \) composed of a finite number of distributions \(f_j \in {\mathcal {D}}'({\mathbb {R}}^{n_j})\), \(n_j \ge 1\), \(j=1,\ldots ,m\), \(m \ge 2\), can be expressed as a sum of tensor products

$$\begin{aligned} \left( \sum _{j=1}^m f_j \right) ^k = \sum _{j_1=1}^m\cdots \sum _{j_k=1}^m\, f_{j_1}\otimes \cdots \otimes f_{j_k}, \quad k \in {\mathbb {N}}\end{aligned}$$

with the sum ranging over all possible combinations of the indexes \(j_1,\ldots ,j_k\). If the distributions \(f_1,\ldots ,f_m\) are symmetric then one can reorder the indexes \(j_1,\ldots ,j_k\) by any permutation \(\sigma \) without changing the value of the sum. Hence, the tensor products on the right-hand side can be replaced by symmetrized products

$$\begin{aligned} \sum _{j_1=1}^m\cdots \sum _{j_k=1}^m\, f_{j_1}\otimes \cdots \otimes f_{j_k} = \sum _{j_1=1}^m\cdots \sum _{j_k=1}^m\, \left[ f_{j_1}\otimes \cdots \otimes f_{j_k}\right] _{\text {sym}}\,. \end{aligned}$$

Inside the symmetrisation operator, the tensor product of symmetric distributions acts as a commutative operation. For this reason the sum includes summands that are equal and, by grouping them, we obtain an expression that is similar to the multinomial formula [21]

$$\begin{aligned} \left( \sum _{j=1}^m f_j \right) ^k = \sum _{|\alpha | = k} \frac{k!}{\alpha !} \left[ f^{\otimes \alpha }\right] _{\text {sym}}\,, \qquad f = (f_1,\ldots ,f_m) \end{aligned}$$
(9.10)

with \(\alpha \) an m-tuple in \({\mathbb {N}}^m\),

$$\begin{aligned} f^{\otimes \alpha } :=f_1^{\otimes \alpha _1} \otimes \cdots \otimes f_m^{\otimes \alpha _m} \end{aligned}$$
(9.11)

and where we made use of the multi-index notation introduced in Sect. 4.6.
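A quick numerical check of (9.10), for \(m = 2\) and \(k = 2\), with sampled stand-ins for the distributions:

```python
import numpy as np

# (f1 + f2)^{(x)2} versus the multinomial expansion with symmetrized products.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=5), rng.normal(size=5)
lhs = np.multiply.outer(f1 + f2, f1 + f2)
sym12 = (np.multiply.outer(f1, f2) + np.multiply.outer(f2, f1))/2
rhs = np.multiply.outer(f1, f1) + 2*sym12 + np.multiply.outer(f2, f2)
print(np.allclose(lhs, rhs))  # True
```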

In general the product that we defined on \({\mathcal {D}}'_{\oplus }\) applied to two elements of \({\mathcal {D}}'_{\oplus ,\text {sym}}\) does not result in an element of \({\mathcal {D}}'_{\oplus ,\text {sym}}\). This can be remedied by symmetrizing the product

$$\begin{aligned} (g h)_k :=(g \cdot h)_k :=\sum _{j=0}^k \left[ g_j \otimes h_{k - j}\right] _{\text {sym}}\,, \qquad g, h \in {\mathcal {D}}'_{\oplus ,\text {sym}}\,. \end{aligned}$$
(9.12)

Unless explicitly stated otherwise, when working in \({\mathcal {D}}'_{\oplus ,\text {sym}}\) we will always assume the use of this symmetrized product.

The last property of symmetric distributions that we want to mention is the fact that, in a convolution algebra, the inverse of a symmetric distribution is symmetric, for

$$\begin{aligned} \begin{aligned} \delta (\tau _1,\tau _2) &=f(\tau _1,\tau _2) *f^{*-1}(\tau _1,\tau _2) \\ &= f(\tau _2,\tau _1) *f^{*-1}(\tau _1,\tau _2)\\ &= f(\tau _1,\tau _2) *f^{*-1}(\tau _2,\tau _1)\,. \end{aligned} \end{aligned}$$

5 Weakly Nonlinear Systems

We are looking for a representation, in the spirit of a perturbation theory, of a class of nonlinear systems including the ones described by differential equations of the form

$$\begin{aligned} L y = x + \sum _{k=2}^K c_k y^k \end{aligned}$$
(9.13)

with \(x\in {\mathcal {D}}'({\mathbb {R}})\) a given input signal, L a linear differential operator with constant coefficients

$$\begin{aligned} L = D^m + a_{m-1}D^{m-1} + \cdots + a_1D+ a_0 \end{aligned}$$

and where we assume that the linearized system is stable.

In Chap. 7 we saw that, in the language of distributions, a linear differential equation with constant coefficients becomes a convolution equation. If we want to apply the results obtained for convolution equations, we need to give a meaning to the nonlinear terms appearing in the above equation.

In general, it’s not possible to define a multiplication valid for arbitrary distributions. Therefore, the terms \(y^k, k > 1\) can’t be assumed to belong to \({\mathcal {D}}'({\mathbb {R}})\). To work around this problem we can assume y to belong to a direct product of distributions, \(y=(y_0, y_1, y_2, \dotsc )\), and use the product defined on that space. Since the product between functions with values in \({\mathbb {C}}\) is commutative \(f \cdot g = g \cdot f\), we require y to belong to the direct product of symmetric distributions \({\mathcal {D}}'_{\oplus ,\text {sym}}\). Then, if \(y_1\) is the solution of the linearized equation its powers become tensor powers

$$\begin{aligned} (y_1)^k = y_1^{\otimes k}\,. \end{aligned}$$

If \(y_1\) is a regular distribution, that is a locally integrable function, then we can recover the meaning of the powers in the differential equation by evaluating \(y_1^{\otimes k}\) on the diagonal

$$\begin{aligned} y_1^{\otimes k}(t, \dotsc , t) = y_1^k(t)\,. \end{aligned}$$

The same remains true if we replace \(y_1\) by a sum of distributions.

To complete the interpretation of the differential equation in the language of distributions, it remains to clarify the effect of the one-dimensional differential operator \(D\) appearing in (9.13) on the components \(y_k \in {\mathcal {D}}'_\text {sym}({\mathbb {R}}^k)\) of y. To this end, suppose \(y_k\) to be a regular distribution. Then it is a locally integrable function

$$\begin{aligned} y_k: \tau \mapsto y_k(\tau _1,\ldots ,\tau _k)\,, \qquad \tau \in {\mathbb {R}}^k \end{aligned}$$

and we can associate with it a function of the single variable t by defining an operation that we call “evaluating on the diagonal” 

$$\begin{aligned} \mathrm {ev_{d}}(y_k) :=t \mapsto y_k(t,\ldots ,t)\,, \qquad t \in {\mathbb {R}}\,. \end{aligned}$$

If we assume this function to be differentiable, then the derivative with respect to t is well-defined

$$\begin{aligned} D\mathrm {ev_{d}}(y_k)(t) = D_1y_k(t,\ldots ,t) + \cdots + D_ky_k(t,\ldots ,t) \end{aligned}$$

and, as a distribution, can be represented by

$$\begin{aligned} D y_k :=\Bigg ( \sum _{j=0}^{k-1} \delta ^{\otimes j} \otimes D\delta \otimes \delta ^{\otimes k-1-j} \Bigg ) *y_k\,. \end{aligned}$$
(9.14)

This last expression is symmetric and is valid for arbitrary distributions. Therefore, we can take it as the definition of the effect of the differential operator \(D\) on distributions \(y_k\in {\mathcal {D}}'_\text {sym}({\mathbb {R}}^k)\). For \(y\in {\mathcal {D}}'_{\oplus ,\text {sym}}\) and any \(\phi \in {\mathcal {D}}_{\oplus }\), \(\langle y,\phi \rangle \) only has a finite number of terms different from zero. For this reason the effect of \(D\) on y can be defined as acting on each component individually.

For \(y\in {\mathcal {D}}'_{\oplus ,\text {sym}}\) to be a solution of (9.13) in a convolution algebra, the equation must be satisfied by each component \(y_k\) of y individually. For y to be compatible with our assumption that the system is described around the zero equilibrium point, the 0th component \(y_0\) must always be zero

$$\begin{aligned} y_0=0\,. \end{aligned}$$

In analogy with the theory of formal power series we call distributions \(y \in {\mathcal {D}}'_{\oplus }\) with \(y_0 = 0\) nonunits [25].

For \(k=1\) the only terms belonging to \({\mathcal {D}}'({\mathbb {R}})\) appearing in the equation are \(y_1\) and x. Hence, \(y_1\) is the solution of the linearized equation and, as discussed in Sect. 8.1, can be represented by

$$\begin{aligned} y_1 = h_1 *x\,. \end{aligned}$$

For \(k = 2\) we have

$$\begin{aligned} L\delta *y_2 = c_2 \, y_1^{\otimes 2} \qquad \delta \in {\mathcal {D}}'({\mathbb {R}}^2) \end{aligned}$$

and we see that, for the computation of \(y_2\), the tensor power of \(y_1\) plays the role of an input signal applied to a linear system. Assuming that \(L\delta \) has an inverse, we obtain

$$\begin{aligned} y_2 = c_2\, (L\delta )^{*-1}*y_1^{\otimes 2}\,. \end{aligned}$$

The above expression can be further manipulated by noting that

$$\begin{aligned} & \langle \left( a(\tau _1) \otimes b(\tau _2)\right) *\left( f(\tau _1) \otimes g(\tau _2)\right) ,\phi (\tau _1,\tau _2) \rangle \\ & \qquad = \langle \left( a(\tau _1) \otimes b(\tau _2)\right) \otimes \left( f(\lambda _1) \otimes g(\lambda _2)\right) ,\phi (\tau _1+\lambda _1,\tau _2+\lambda _2) \rangle \\ & \qquad = \langle \left( a(\tau _1) *f(\tau _1)\right) \otimes \left( b(\tau _2) *g(\tau _2)\right) ,\phi (\tau _1,\tau _2) \rangle \end{aligned}$$

or

$$\begin{aligned} \left( a \otimes b\right) *\left( f \otimes g\right) = \left( a *f\right) \otimes \left( b *g\right) \,. \end{aligned}$$
(9.15)

With this expression and the solution found for \(y_1\) we can express \(y_2\) as

$$\begin{aligned} y_2 = h_2 *x^{\otimes 2}\,, \qquad h_2 :=c_2\, (L\delta )^{*-1} *h_1^{\otimes 2} \end{aligned}$$

where raising to a tensor power is assumed to have higher priority than convolution.

From this it is not difficult to see that every component \(y_k\) can be expressed as the convolution of a distribution \(h_k\) specific to the problem and the input signal x raised to the tensor power of k

$$\begin{aligned} y_k = h_k *x^{\otimes k}\,. \end{aligned}$$

We are therefore led to define a weakly nonlinear (or analytic) time-invariant (WNTI) system as a system \({\mathcal {H}}\) whose behavior around the zero equilibrium point can be described by an element h of \({\mathcal {D}}'_{\oplus ,\text {sym}}\) such that, when driven by the input signal x, its output is given by

$$\begin{aligned} y = h[x] :=\sum _{k=1}^\infty h_k *x^{\otimes k}\,, \qquad h_k \in {\mathcal {A}}'_k\,, \quad x \in {\mathcal {A}}'_1 \end{aligned}$$
(9.16)

with \({\mathcal {A}}'_1\) a convolution algebra in \({\mathcal {D}}'({\mathbb {R}})\) and \({\mathcal {A}}'_k\) a convolution algebra in \({\mathcal {D}}'_\text {sym}({\mathbb {R}}^k)\) compatible with \({\mathcal {A}}'_1\) and the tensor product. This means that, if \(x \in {\mathcal {A}}'_1\), then \(x^{\otimes k}\) must be an element of \({\mathcal {A}}'_k\). We denote such a set of convolution algebras by \({\mathcal {A}}'_{\oplus ,\text {sym}}\). The distribution \(h_k\) is called the kth order impulse response (or kernel) of the system. A block diagram representation of a weakly nonlinear system is shown in Fig. 9.4. Note that, if the input signal is multiplied by a constant \(c\in {\mathbb {C}}\), \(y_k\) is scaled by a factor of \(c^k\)

$$\begin{aligned} y_k = h_k *(c\,x)^{\otimes k} = c^k \left( h_k *x^{\otimes k}\right) \,. \end{aligned}$$
Fig. 9.4  Block diagram representation of a time-invariant weakly nonlinear system. If the input signal is proportional to the constant \(c\in {\mathbb {C}}\), the output of the block characterized by the kth order impulse response \(h_k\) is proportional to \(c^k\)

The output in our definition of a weakly nonlinear system requires some comment, as it doesn't always represent a quantity that can be interpreted as a signal depending on time. Under the assumption that all involved distributions belong to a convolution algebra, one can distinguish the following cases

  • If the impulse responses \(h_k\) as well as the input signal x are regular distributions and the convolutions \(h_k*x^{\otimes k}\) are well-defined (see Sect. 3.2) then all output components \(y_k\) are locally integrable functions. In this case we can evaluate the \(y_k\) on the diagonal

    $$\begin{aligned} & \mathrm {ev_{d}}(y_k)(t) = \mathrm {ev_{d}}(h_k *x^{\otimes k})(t) = \nonumber \\ &\qquad \int \limits _{-\infty }^\infty \cdots \int \limits _{-\infty }^\infty h_k(\tau _1,\ldots ,\tau _k) x(t - \tau _1)\cdot \cdots \cdot x(t - \tau _k) d\tau _1 \cdots d\tau _k \end{aligned}$$
    (9.17)

    and obtain an interpretation for the \(y_k\) as signals of time. If the input signal is scaled by the constant c, then, at each time t, the output \(\mathrm {ev_{d}}(y)(t)\) is seen to be a power series in c

    $$\begin{aligned} \mathrm {ev_{d}}(y)(t) :=\sum _{k=1}^\infty c^k \mathrm {ev_{d}}(h_k *x^{\otimes k})(t)\,. \end{aligned}$$

    If this series has a convergence radius greater than zero valid at all times, then \(\mathrm {ev_{d}}(y)\) represents a well-defined function of time and we have a clear procedure to interpret the output of the system.

  • If some or all of the impulse responses \(h_k\) are not regular, there is still a class of input signals for which all \(y_k\) are regular distributions. (Remember that the convolution of any distribution with a test function is an indefinitely differentiable function.) The system restricted to this class of input signals may still be evaluated on the diagonal to obtain a function of time \(\mathrm {ev_{d}}(y)\) as in the previous case.

  • If for no (nonzero) input signal there is a constant \(c > 0\) such that, with the input scaled by c, \(\mathrm {ev_{d}}(y)(t)\) remains finite at all times, then the system can't be represented using an element of \({\mathcal {D}}'_{\oplus ,\text {sym}}\).

Example 9.4: Polynomial System

In this example we consider a class of systems whose impulse responses are not regular.

Suppose that the output of a system \({\mathcal {H}}\) is represented by a nonlinear function f of the input signal x and that the function f can be adequately approximated by a Taylor polynomial around the origin

$$\begin{aligned} y = f(x) \approx \sum _{k=1}^K \frac{f^{(k)}(0)}{k!} x^k\,, \qquad f(0) = 0\,, \quad K>1\,. \end{aligned}$$
(9.18)

It is readily seen that such a system can be represented by the impulse responses

$$\begin{aligned} h_k = \frac{f^{(k)}(0)}{k!} \delta ^{\otimes k}\,, \qquad k=1,\ldots ,K\,. \end{aligned}$$

The response of the system to the input signal x as represented by these impulse responses is

$$\begin{aligned} y = h[x] = \sum _{k=1}^K \frac{f^{(k)}(0)}{k!} \delta ^{\otimes k} *x^{\otimes k}\,. \end{aligned}$$

If the input signal is not a regular distribution, for example if it is a Dirac pulse, then neither the initial representation given by (9.18) nor the evaluation on the diagonal \(\mathrm {ev_{d}}(h[\delta ])\) has a meaning. In spite of this, the impulse responses and their outputs \(y_k\) are mathematically well-defined.

If the class of input signals is restricted to regular distributions then the output obtained from the representation in terms of impulse responses by evaluating on the diagonal \(\mathrm {ev_{d}}(h[x])\) agrees with the original one.

If f is analytic, then it can be represented by a power series (\(K\rightarrow \infty \)). In this case the output of the system is only well defined if the magnitude of the input signal \(|x(t) |\) remains smaller than the convergence radius of the series at all times.
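For regular inputs, the representation thus reduces to applying the Taylor polynomial of f pointwise; a minimal NumPy sketch, with \(f = \tanh \) as an assumed example nonlinearity:

```python
import numpy as np

# Truncated polynomial system (9.18) for f = tanh: f^(k)(0)/k! for k = 1, 2, 3
coeffs = [1.0, 0.0, -1.0/3.0]
t = np.linspace(0.0, 1.0, 101)
x = 0.2*np.sin(2*np.pi*t)               # small input keeps truncation accurate
y = sum(ck*x**k for k, ck in enumerate(coeffs, start=1))
print(np.max(np.abs(y - np.tanh(x))))   # small truncation error, O(x^5)
```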

Let \(h_k\) be the kth order impulse responses of the weakly nonlinear system \({\mathcal {H}}\) and x its input signal. In Sect. 3.3 we saw that an arbitrary distribution can be approximated to any desired accuracy by a finite sum of Dirac pulses. Hence, x can be approximated by

$$\begin{aligned} x \approx \sum _{j=1}^M a_j \,\delta (t - \lambda _j)\,, \qquad a_j \in {\mathbb {C}},\quad \lambda _j \in {\mathbb {R}}\end{aligned}$$

and the output of \(h_k\) by

$$\begin{aligned} \begin{aligned} y_k &\approx \sum _{j_1 = 1}^M \cdots \sum _{j_k = 1}^M h_k *a_{j_1} \delta (\tau _1 - \lambda _{j_1}) \otimes \cdots \otimes a_{j_k} \delta (\tau _k - \lambda _{j_k})\\ &= \sum _{j_1 = 1}^M \cdots \sum _{j_k = 1}^M a_{j_1} \cdot \cdots \cdot a_{j_k} h_k(\tau _1 - \lambda _{j_1},\ldots ,\tau _k - \lambda _{j_k})\,. \end{aligned} \end{aligned}$$

This expression suggests the interpretation of \(h_k\) as that portion of the system defining how the response of the system depends on combinations of k values of the input signal at k points in time.

In addition, if we compare the expression representing the output at time t of the (causal) impulse response \(h_k\)

$$\begin{aligned} \mathrm {ev_{d}}(h_k *x^{\otimes k})(t) \end{aligned}$$

with the one of a polynomial system (see Example 9.4)

$$\begin{aligned} \mathrm {ev_{d}}(c_k\delta ^{\otimes k} *x^{\otimes k})(t) = c_k\,x^k(t) \end{aligned}$$

we see that the output at time t of the latter only depends on the kth power of the current value of the input signal. In contrast to this, the output at time t of the former depends on all combinations of products of k (past) values of the input signal. The impulse responses \(h_k\) can thus be interpreted as the memory of the system. The given representation of weakly nonlinear systems can be seen as a generalization of the Taylor approximation method for memory-less systems to systems with memory. It is called the Volterra functional series in honor of V. Volterra who first proposed it [5].

6 Nonlinear Transfer Functions

All impulse responses \(h_k\) of a causal weakly nonlinear system must vanish if any argument \(\tau _j\) is less than zero. This is most easily seen if we consider the case where the impulse responses as well as the input signal x are regular distributions, for then

$$\begin{aligned} \mathrm {ev_{d}}(y_k)(t) =& \\ & \int \limits _{-\infty }^\infty \cdots \int \limits _{-\infty }^\infty h_k(t - \tau _1,\ldots ,t - \tau _k) x(\tau _1)\cdot \cdots \cdot x(\tau _k) d\tau _1 \cdots d\tau _k\,. \end{aligned}$$

As every distribution is the limit of a sequence of smooth functions, the same must then be true for arbitrary distributions. The impulse responses of all orders of a causal system are therefore right-sided distributions.

The Laplace transform of the kth order impulse response \(h_k\) is called the nonlinear transfer function of order k

$$\begin{aligned} H_k(s_1,\ldots ,s_k) = \langle h_k(\tau _1,\ldots ,\tau _k),\textrm{e}^{-s_1\tau _1 - \cdots -s_k\tau _k} \rangle \,. \end{aligned}$$
(9.19)

Due to the symmetry of \(h_k\), it is a symmetric function of the variables \(s_1\) to \(s_k\)

$$\begin{aligned} H_k(s_1,\ldots ,s_k) = H_k(s_{\sigma (1)},\ldots ,s_{\sigma (k)})\,, \qquad \sigma \in \mathsf {S_{k}}\,. \end{aligned}$$
(9.20)

As the Laplace transform converts convolution products into ordinary products, the Laplace transform of \(y_k=h_k*x^{\otimes k}\) is

$$\begin{aligned} Y_k(s_1,\ldots ,s_k) = H_k(s_1,\ldots ,s_k) X(s_1) \cdot \cdots \cdot X(s_k)\,. \end{aligned}$$

Just as with LTI systems, the many useful properties of the Laplace transform make it a very valuable tool for solving the convolution equations describing weakly nonlinear systems. In particular, on top of converting convolution products into ordinary multiplications, in their region of convergence the Laplace transforms of distributions are holomorphic functions.

Consider a system described by a differential equation with constant coefficients of the type considered before

$$\begin{aligned} \begin{array}{cc} L y = N x + \sum _{j=2}^J c_jy^j\\ L = D^n + a_{n-1}D^{n-1} + \cdots + a_0 \\ N = b_mD^m + b_{m-1}D^{m-1} + \cdots + b_0. \end{array} \end{aligned}$$
(9.21)

The part of the corresponding convolution equation relevant for the calculation of \(y_k\), \(k>1\), is

$$\begin{aligned} (L \delta ^{\otimes k}) *y_k = \sum _{j=2}^k c_j (y_1+\cdots +y_{k-1})^j\,. \end{aligned}$$

As the Laplace transform of \(D\delta ^{\otimes k}\) is

$$\begin{aligned} {\mathcal {L}}\{D\delta ^{\otimes k}\}(s_1,\ldots ,s_k) = s_1 + \cdots + s_k \end{aligned}$$

(see (9.14)), the Laplace transform of \(L\delta ^{\otimes k}\) is a polynomial in \(s_1+\cdots +s_k\)

$$\begin{aligned} P(s_1+\cdots +s_k) = (s_1+\cdots +s_k)^n + a_{n-1}(s_1+\cdots +s_k)^{n-1} + \cdots + a_0\,. \end{aligned}$$

Note that the coefficients of this polynomial are the same for all k, including \(k=1\); the only difference between the various values of k is in the argument. If we factor it, we see that the denominator of \(H_k\) adds, to the denominators of the lower order transfer functions \(H_j, j=1,\ldots ,k-1\), terms of the form

$$\begin{aligned} (s_1 + \cdots + s_k - p_j)^{l_j} \end{aligned}$$

with \(p_j\) the jth pole and \(l_j\) its multiplicity. If we assume \(H_k\) to be a proper rational function, then its partial fraction expansion will include terms of the form

$$\begin{aligned} \frac{F(s_1,\ldots ,s_{k-1})}{(s_1 + \cdots + s_k - p_j)^{l_j}} \end{aligned}$$

and similar ones where some of the variables \(s_1,\ldots ,s_{k-1}\) may be missing. If in the calculation of the inverse Laplace transform we start by inverse transforming with respect to \(s_k\), we obtain the expression

$$\begin{aligned} F(s_1,\ldots ,s_{k-1}) \, \tau _k^{l_j-1} \, \textrm{e}^{[p_j -(s_1 + \cdots + s_{k-1})] \tau _k} \, \textsf{1}_{+}(\tau _k)\,. \end{aligned}$$

By using the shifting property of the Laplace transform and denoting by f the inverse transform of F, the complete inverse transform of the above expression is

$$\begin{aligned} f(\tau _1- \tau _k, \ldots ,\tau _{k-1} - \tau _k) \otimes \tau _k^{l_j-1} \, \textrm{e}^{p_j \tau _k} \, \textsf{1}_{+}(\tau _k)\,. \end{aligned}$$

If \(H_k\) is not a proper rational function, then it can be decomposed into a polynomial and a proper rational function. The inverse Laplace transform of the polynomial part results in Dirac pulses and their derivatives.

This shows that, if the system under consideration can be described by a differential equation with constant coefficients of the indicated type, then, similarly to the first order impulse response \(h_1\), the higher order impulse responses are sums of Dirac pulses, their derivatives and products of polynomials and exponential functions in the variables \(\tau _1,\ldots ,\tau _k\). In addition, it also shows that, if the linear transfer function \(H_1(s_1)\) has all its poles in the left-hand side of the complex plane, then not only does the regular part of \(h_1\) (that is, discarding the Dirac pulses and their derivatives) decay exponentially as its argument tends to infinity, but so do the regular parts of all higher order impulse responses \(h_k\). In particular, we see that all impulse responses are summable distributions

$$\begin{aligned} h_k \in {\mathcal {D}}_{L^1+}'({\mathbb {R}}^k)\,, \qquad k = 1,2,\ldots \,. \end{aligned}$$

In the following, unless explicitly stated otherwise, we are always going to assume the systems to be of this type.

Example 9.5

We revisit Example 9.2 and find an approximate solution of the initial value problem

$$\begin{aligned} Dy = -a y + c y^2\,, \qquad y(0) = y_0\,, \qquad a, c > 0\,, \end{aligned}$$

valid around its zero equilibrium point.

As we saw, in translating an initial value problem into the language of distributions, the initial conditions become part of the equation which, in this case, becomes

$$\begin{aligned} (D+ a) y = y_0\delta + c y^2\,. \end{aligned}$$

We can think of this equation as an equation describing a system driven by the input signal \(x = y_0\delta \). The solution of the equation y is an element of \({\mathcal {D}}'_{\oplus ,\text {sym}}\) and has the form

$$\begin{aligned} y = \sum _{k=1}^\infty h_k *x^{\otimes k}\,. \end{aligned}$$

The system is therefore fully characterized if we find the impulse responses \(h_k\). The solution of the original problem is then found by multiplying each impulse response \(h_k\) by \(y_0^k\)

$$\begin{aligned} y_k = h_k y_0^k\,. \end{aligned}$$

To find the impulse responses we apply the input signal \(x = \delta \) and insert \(y = h\) into the equation. The equation is solved if it is satisfied by each component \(h_k\) of h individually. The component \(h_k\) can be computed from the equation and the impulse responses of lower order \(h_j, j=1,\ldots ,k-1\).

To find \(h_1\) we retain only terms of the equation belonging to \({\mathcal {D}}'({\mathbb {R}})\)

$$\begin{aligned} (D\delta + a\delta ) *h_1 = \delta \,. \end{aligned}$$

If we Laplace transform the equation we obtain

$$\begin{aligned} (s_1 + a) H_1(s_1) = 1 \end{aligned}$$

from which we immediately obtain the first order transfer function

$$\begin{aligned} H_1(s_1) = \frac{1}{s_1 + a} \end{aligned}$$

and, by inverse Laplace transformation, the first order impulse response

$$\begin{aligned} h_1(\tau _1) = \textsf{1}_{+}(\tau _1) \, \textrm{e}^{-a \tau _1}\,. \end{aligned}$$

The second order impulse response \(h_2\) is found by retaining in the equation only terms belonging to \({\mathcal {D}}'({\mathbb {R}}^2)\)

$$\begin{aligned} (D+ a)\delta ^{\otimes 2} *h_2 = c \, h_1^{\otimes 2}\,. \end{aligned}$$

From the Laplace transformed equation

$$\begin{aligned} (s_1 + s_2 + a) H_2(s_1,s_2) = c \, H_1(s_1) H_1(s_2) \end{aligned}$$

we immediately obtain the second order nonlinear transfer function

$$\begin{aligned} H_2(s_1,s_2) = \frac{c \, H_1(s_1) H_1(s_2)}{s_1 + s_2 + a}\,. \end{aligned}$$

Note that it’s often convenient to write higher-order transfer functions in terms of the first-order one. In this example

$$\begin{aligned} H_2(s_1,s_2) = c \, H_1(s_1 + s_2) H_1(s_1) H_1(s_2)\,. \end{aligned}$$

To obtain the second order impulse response we inverse Laplace transform, first with respect to one Laplace variable, then with respect to the other, and finally symmetrize the result. We first inverse transform with respect to \(s_2\) the expression

$$\begin{aligned} H_1(s_1 + s_2) H_1(s_2) = \frac{1}{[s_2 + (s_1 + a)](s_2 + a)}\,. \end{aligned}$$

Assuming \(s_1\ne 0\) and expanding in partial fractions we find

$$\begin{aligned} \frac{1}{s_1} \left( 1 - \textrm{e}^ {- s_1 \, \tau _2 } \right) \textrm{e}^ {- a\,\tau _2} \,\textsf{1}_{+}(\tau _2)\,. \end{aligned}$$

We then combine this expression with the other factors of \(H_2\)

$$\begin{aligned} \frac{c}{(s_1 + a) s_1} \left( 1 - \textrm{e}^ {- s_1 \, \tau _2 } \right) \textrm{e}^ {- a\,\tau _2} \,\textsf{1}_{+}(\tau _2) \end{aligned}$$

and inverse transform with respect to \(s_1\). This can be done by expanding in partial fractions the first factor

$$\begin{aligned} {\mathcal {L}}^{-1}\left\{ \frac{c}{(s_1 + a) s_1}\right\} (\tau _1) = \frac{c}{a} \left( 1 - \textrm{e}^{-a \tau _1} \right) \textsf{1}_{+}(\tau _1) \end{aligned}$$

and by using the shifting property of the Laplace transform to find

$$\begin{aligned} \frac{c}{a} \left[ \left( 1 - \textrm{e}^{-a \tau _1}\right) \textsf{1}_{+}(\tau _1) - \left( 1 - \textrm{e}^{-a (\tau _1 - \tau _2)} \right) \textsf{1}_{+}(\tau _1 - \tau _2) \right] \textrm{e}^{-a \tau _2}\,\textsf{1}_{+}(\tau _2)\,. \end{aligned}$$

Note that this expression is not symmetric and that if we had first inverse transformed with respect to \(s_1\) and then to \(s_2\), we would have obtained an expression with \(\tau _1\) and \(\tau _2\) exchanged.

The second-order impulse response is obtained from the above expression by symmetrisation

$$\begin{aligned} h_2(\tau _1,\tau _2) = \left[ \frac{c}{a} \left[ \left( 1 - \textrm{e}^{-a \tau _1}\right) - \left( 1 - \textrm{e}^{-a (\tau _1 - \tau _2)} \right) \textsf{1}_{+}(\tau _1 - \tau _2) \right] \textrm{e}^{-a \tau _2}\right] _{\text {sym}} \end{aligned}$$

where we have suppressed the explicit Heaviside step functions with the understanding that the expression is zero if \(\tau _1 < 0\) or \(\tau _2 < 0\). As \(h_2\) is a regular distribution, it can be evaluated on the diagonal and we obtain

$$\begin{aligned} \mathrm {ev_{d}}(h_2)(t) = \frac{c}{a} \left( \textrm{e}^{-a t} - \textrm{e}^{-2 a t}\right) \,. \end{aligned}$$

The third order impulse response \(h_3\) is found by retaining only elements belonging to \({\mathcal {D}}'({\mathbb {R}}^3)\) in the equation. As a first step we write

$$\begin{aligned} (D+ a)\delta ^{\otimes 3} *h_3 = c \, (h_1 + h_2)^2 \end{aligned}$$

for no other term can produce distributions belonging to \({\mathcal {D}}'({\mathbb {R}}^3)\). The right hand side can be expanded with the help of (9.10) and, retaining only the terms of interest, we obtain

$$\begin{aligned} (D+ a)\delta ^{\otimes 3} *h_3 = 2 c \left[ h_1 \otimes h_2\right] _{\text {sym}}\,. \end{aligned}$$

The Laplace transformed equation is

$$\begin{aligned} (s_1 + s_2 + s_3 + a)H_3(s_1,s_2,s_3) = 2 c \left[ H_1(s_1) H_2(s_2,s_3)\right] _{\text {sym}} \end{aligned}$$

and with it the third order nonlinear transfer function is readily obtained

$$\begin{aligned} H_3(s_1,s_2,s_3) = 2 c \, H_1(s_1 + s_2 + s_3) \left[ H_1(s_1) H_2(s_2,s_3)\right] _{\text {sym}}\,. \end{aligned}$$

By expressing \(H_2\) in terms of \(H_1\) we can write \(H_3\) in terms of \(H_1\) alone

$$\begin{aligned} & H_3(s_1,s_2,s_3) = \frac{2}{3} c^2 \, H_1(s_1 + s_2 + s_3) H_1(s_1)H_1(s_2)H_1(s_3)\\ &\qquad \qquad \qquad \qquad \qquad \qquad \qquad \cdot \left[ H_1(s_1 + s_2) + H_1(s_1 + s_3) + H_1(s_2 + s_3) \right] \,. \end{aligned}$$
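The equivalence of the two forms of \(H_3\) is easily confirmed with a CAS; a minimal SymPy sketch:

```python
import sympy as sp

# Check that 2c H1(s1+s2+s3) [H1(s1) H2(s2,s3)]_sym equals the expression above.
s1, s2, s3, a, c = sp.symbols('s1 s2 s3 a c')
H1 = lambda s: 1/(s + a)
H2 = lambda u, v: c*H1(u + v)*H1(u)*H1(v)
sym = (H1(s1)*H2(s2, s3) + H1(s2)*H2(s1, s3) + H1(s3)*H2(s1, s2))/3
H3a = 2*c*H1(s1 + s2 + s3)*sym
H3b = sp.Rational(2, 3)*c**2*H1(s1 + s2 + s3)*H1(s1)*H1(s2)*H1(s3) \
      *(H1(s1 + s2) + H1(s1 + s3) + H1(s2 + s3))
print(sp.simplify(H3a - H3b))  # -> 0
```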

The computation of the third order impulse response proceeds along the same lines as the computation of \(h_2\). After some algebraic manipulations and exploiting the properties of the Laplace transform we obtain a rather long expression whose evaluation on the diagonal is

$$\begin{aligned} \mathrm {ev_{d}}(h_3)(t) = \left( \frac{c}{a}\right) ^2 \left( \textrm{e}^{-a t} - 2 \textrm{e}^{-2 a t} + \textrm{e}^{-3 a t}\right) \,. \end{aligned}$$

At this point it’s interesting to compare the first three elements of the approximate solution that we computed here with the exact solution that we calculated in Example 9.2 and that we reproduce here for convenience

$$\begin{aligned} y(t) = y_0 \frac{\textrm{e}^{-at}}{1 - y_0\frac{c}{a}(1 - \textrm{e}^{-at})}\,. \end{aligned}$$

If \(|y_0 c/a | < 1\) the exact solution can be expanded in a geometric power series

$$\begin{aligned} \begin{aligned} y(t) &= y_0 \textrm{e}^{-at} \sum _{j=0}^\infty \left[ \frac{y_0 c}{a}(1 - \textrm{e}^{-at})\right] ^j\\ &= y_0 \textrm{e}^{-at} + y_0^2 \frac{c}{a} \left( \textrm{e}^{-a t} - \textrm{e}^{-2 a t}\right) \\ &\quad + y_0^3 \left( \frac{c}{a}\right) ^2 \left( \textrm{e}^{-a t} - 2 \textrm{e}^{-2 a t} + \textrm{e}^{-3 a t}\right) + \cdots \\ &= \mathrm {ev_{d}}(h_1y_0 + h_2y_0^2 + h_3y_0^3)(t) + \cdots \end{aligned} \end{aligned}$$

and see that the lowest order terms correspond to the calculated response components \(y_1, y_2\) and \(y_3\). Note also that the convergence radius of the power series derived from the exact solution corresponds to the radius of the largest open ball, centered at the origin and contained in the domain of attraction of the equilibrium point

$$\begin{aligned} \mathbb {B}(0,a/c) :=\left\{ y_0 \in {\mathbb {R}}\ | \ |y_0 | < \frac{a}{c} \right\} \,. \end{aligned}$$

Figure 9.5 compares the exact solution of the initial value problem with the approximation given by \(\mathrm {ev_{d}}(y_1 + y_2 + y_3)\) for \(a=1, c=1/2, y_0=1\).
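The comparison is easily automated; a minimal SymPy sketch checking that the three response components coincide with the low-order terms of the expansion of the exact solution in \(y_0\):

```python
import sympy as sp

t, a, c, y0 = sp.symbols('t a c y_0', positive=True)
y_exact = y0*sp.exp(-a*t)/(1 - y0*(c/a)*(1 - sp.exp(-a*t)))
series = sp.series(y_exact, y0, 0, 4).removeO()
y1 = y0*sp.exp(-a*t)
y2 = y0**2*(c/a)*(sp.exp(-a*t) - sp.exp(-2*a*t))
y3 = y0**3*(c/a)**2*(sp.exp(-a*t) - 2*sp.exp(-2*a*t) + sp.exp(-3*a*t))
print(sp.simplify(series - (y1 + y2 + y3)))  # -> 0
```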

Fig. 9.5  Comparison of the approximate solutions of the logistic differential equation in terms of \(y_1, y_2\) and \(y_3\) against the exact solution y(t) for \(a=1, c=1/2, y_0=1\)

While for this particular example it was easier to compute the exact solution than to calculate the approximation, the latter allows us to obtain the output of the system described by the differential equation

$$\begin{aligned} Dy + a y = x + c y^2 \end{aligned}$$
(9.22)

for any input signal \(x \in {\mathcal {D_+'}}({\mathbb {R}})\) maintaining the system within the region of attraction of the equilibrium point

$$\begin{aligned} \mathrm {ev_{d}}(y)(t) \approx \mathrm {ev_{d}}(y_1 + y_2 + y_3)(t) \end{aligned}$$

with

$$\begin{aligned} y_1(t) &= \int \limits _0^t h_1(t - \tau _1) x(\tau _1) d\tau _1\\ \mathrm {ev_{d}}(y_2)(t) &= \int \limits _0^t\int \limits _0^t h_2(t - \tau _1, t - \tau _2) x(\tau _1) x(\tau _2) d\tau _1 d\tau _2\\ \mathrm {ev_{d}}(y_3)(t) &= \int \limits _0^t\int \limits _0^t\int \limits _0^t h_3(t - \tau _1, t - \tau _2, t - \tau _3) x(\tau _1) x(\tau _2) x(\tau _3) d\tau _1 d\tau _2 d\tau _3\,. \end{aligned}$$

Here and in many problems, this amounts to limiting the magnitude of the input signal to sufficiently small values. Figure 9.6 shows the approximate solution for a sinusoidal input \(x(t) = \textsf{1}_{+}(t)\sin (t)\) and compares it to the solution obtained by numerical integration of the differential equation for \(a=1, c=1/2\).
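The comparison can be reproduced numerically. A sketch truncated at second order, assuming a uniform time grid; one can check that for \(\tau _1,\tau _2 \ge 0\) the symmetrized kernel found above simplifies to \(h_2(\tau _1,\tau _2) = (c/a)\,(\textrm{e}^{-a\max (\tau _1,\tau _2)} - \textrm{e}^{-a(\tau _1+\tau _2)})\):

```python
import numpy as np
from scipy.integrate import odeint

a, c, dt = 1.0, 0.5, 0.05
t = np.arange(0.0, 10.0, dt)
x = np.sin(t)

h1 = np.exp(-a*t)
y1 = np.convolve(h1, x)[:len(t)]*dt          # y_1 = h_1 * x

y2 = np.zeros_like(t)                        # ev_d(y_2) by direct quadrature
for i in range(len(t)):
    u = t[i] - t[:i + 1]                     # t - tau >= 0
    U1, U2 = np.meshgrid(u, u, indexing='ij')
    h2 = (c/a)*(np.exp(-a*np.maximum(U1, U2)) - np.exp(-a*(U1 + U2)))
    y2[i] = x[:i + 1] @ h2 @ x[:i + 1]*dt*dt

y_num = odeint(lambda y, tt: -a*y + c*y**2 + np.sin(tt), 0.0, t).ravel()
print(np.max(np.abs(y1 + y2 - y_num)))       # residual ~ neglected y_3 term
```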

Fig. 9.6  Comparison of the approximate solutions of (9.22) with a sinusoidal input \(x(t) = \textsf{1}_{+}(t)\sin (t)\) in terms of the three lowest order response components \(y_1, y_2\) and \(y_3\) against the solution y(t) obtained by numerical integration for \(a=1, c=1/2\)

This example shows how, by representing the solution of a nonlinear differential equation describing a weakly nonlinear system by a sequence of distributions \(y\in {\mathcal {D}}'_{\oplus ,\text {sym}}\), we have reduced the problem of solving a nonlinear differential equation to an essentially algebraic one. While some expressions are rather long, they can be manipulated easily by modern computer algebra systems (CAS).

Example 9.6

We revisit Example 9.3 and try to find an approximate solution in \({\mathcal {D}}'_{\oplus ,\text {sym}}\) of the initial value problem

$$\begin{aligned} Dy = c y^3\,, \qquad y(0) = y_0\,, \qquad c < 0 \end{aligned}$$

valid around its zero equilibrium point. Note that the linearized equation is stable, but not asymptotically stable.

As before we calculate the impulse responses by setting \(y_0 = 1\). The solution for an arbitrary \(y_0\) is then found by multiplying the kth order impulse response \(h_k\) by \(y_0^k\).

The first order impulse response \(h_1\) is found by writing the convolution equation corresponding to the above initial value problem and retaining only terms of first order

$$\begin{aligned} D\delta *h_1 = \delta \,. \end{aligned}$$

By Laplace transforming the equation, the first order transfer function \(H_1(s_1)\) is found to be

$$\begin{aligned} H_1(s_1) = \frac{1}{s_1}\,. \end{aligned}$$

From it, the first order impulse response is

$$\begin{aligned} h_1(\tau _1) = \textsf{1}_{+}(\tau _1)\,. \end{aligned}$$

The equation doesn’t have second order nonlinearities. Therefore the second order impulse response and the second order transfer function are both zero

$$\begin{aligned} h_2(\tau _1, \tau _2) = 0\,, \qquad H_2(s_1,s_2) = 0\,. \end{aligned}$$

The third order impulse response is found by retaining all third order terms in the convolution equation

$$\begin{aligned} D\delta *h_3 = c\,h_1^{\otimes 3}\,. \end{aligned}$$

By Laplace transforming the equation we find for the third order transfer function

$$\begin{aligned} H_3(s_1, s_2, s_3) = \frac{c}{(s_1 + s_2 + s_3) s_1 s_2 s_3}\,. \end{aligned}$$

From this, the third order impulse response is obtained by inverse Laplace transforming with respect to one variable at a time and by symmetrizing the result

$$\begin{aligned} h_3(\tau _1, \tau _2, \tau _3) =\, &c \Bigl [\tau _3 \textsf{1}_{+}(\tau _3) + (\tau _2 - \tau _3)\textsf{1}_{+}(\tau _3 - \tau _2)\Bigr .\\ & \Bigl . + \textsf{1}_{+}(\tau _2 - \tau _1) \bigl [ (\tau _1 - \tau _3)\textsf{1}_{+}(\tau _3 - \tau _1) + (\tau _3 - \tau _2)\textsf{1}_{+}(\tau _3 - \tau _2) \bigr ]\Bigr ]_{\text {sym}}\,. \end{aligned}$$

From the above results we could conclude that, to third order, the approximate solution of the initial value problem is

$$\begin{aligned} \mathrm {ev_{d}}(y)(t) = y_0 \textsf{1}_{+}(t) + c y_0^3 \textsf{1}_{+}(t) t + \cdots \,. \end{aligned}$$

This is however only valid for sufficiently small values of t. The reason is best seen by comparing the above expression with the exact solution of the initial value problem that we obtained in Example 9.3 and that we repeat here for convenience

$$\begin{aligned} y(t) = \frac{y_0}{\sqrt{1 - 2 c y_0^2 t}}\,. \end{aligned}$$

The Taylor expansion around zero of the function

$$\begin{aligned} x \mapsto \frac{1}{\sqrt{1 - x}} \end{aligned}$$

is

$$\begin{aligned} 1 + \frac{1}{2}x + \frac{1\cdot 3}{2\cdot 4}x^2 + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}x^3 + \frac{1\cdot 3\cdot 5\cdot 7}{2\cdot 4\cdot 6\cdot 8}x^4 + \cdots \end{aligned}$$

and has a convergence radius of 1. Therefore, as long as \(|2cy_0^2 t | < 1\), the exact solution can be represented by the power series

$$\begin{aligned} y(t) = y_0 \left[ 1 + cy_0^2 t + \frac{3}{2}(cy_0^2 t)^2 + \frac{5}{2}(cy_0^2 t)^3 + \frac{35}{8}(cy_0^2 t)^4 + \cdots \right] \end{aligned}$$

whose first two terms coincide with \(y_0 h_1(t)\) and \(\mathrm {ev_{d}}(y_0^3 h_3)(t)\) respectively. However, as t increases, the higher order terms become more and more important and, when \(|2cy_0^2 t | = 1\), the Taylor expansion stops being a valid representation of the exact solution of the initial value problem.
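A one-line check of the coefficients with SymPy:

```python
import sympy as sp

x = sp.symbols('x')
print(sp.series(1/sp.sqrt(1 - x), x, 0, 5))
# 1 + x/2 + 3*x**2/8 + 5*x**3/16 + 35*x**4/128 + O(x**5)
```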

The last example shows that, in general, the solution of a nonlinear differential equation in terms of an element of \({\mathcal {D}}'_{\oplus ,\text {sym}}\) is only meaningful around an equilibrium point for which the linearized equation is asymptotically stable. The reason is that, if this is not the case, the response of the system to any part of the input signal can persist indefinitely in time without ever decreasing to negligible levels. Since this is true for the response of any order, the output \(\mathrm {ev_{d}}(y)\) can not in general be represented by a power series. We can say that systems that are representable by a Volterra series are those whose output does not depend on the too distant past.

In the case in which the linearized system is asymptotically stable all impulse responses are summable distributions. Their Fourier transforms are therefore continuous functions that can be obtained from the nonlinear transfer functions \(H_k\) by

$$\begin{aligned} \hat{h}_k(\omega _1,\ldots ,\omega _k) = H_k(\jmath \omega _1,\ldots ,\jmath \omega _k)\,. \end{aligned}$$

As the nonlinear transfer functions are rational functions, the Fourier transforms \(\hat{h}_k\) are indefinitely differentiable and of polynomial growth, so they belong to \({\mathcal {O}}_M\).

7 Periodic Input Signals

In this section we investigate the response of weakly nonlinear systems to periodic input signals. Given a periodic input signal x, every tensor power \(x^{\otimes k}\) is evidently also a (higher dimensional) periodic distribution. Therefore, every component \(y_k\) of the system response y can be calculated in the convolution algebra of periodic distributions and represented by a Fourier series.

Let x be a \({\mathcal {T}}\)-periodic input signal with Fourier coefficients

$$\begin{aligned} c_m(x) = \frac{1}{{\mathcal {T}}} \langle x,\textrm{e}^{-\jmath m \frac{2\pi }{{\mathcal {T}}} t} \rangle \,, \qquad m \in {\mathbb {Z}}\,. \end{aligned}$$

Further, let \(m = (m_1,\ldots ,m_k) \in {\mathbb {Z}}^k\) be a multi-index and set \(\omega _c = 2\pi /{\mathcal {T}}\); then the Fourier coefficients of the kth tensor power of x are

$$\begin{aligned} \begin{aligned} c_m(x^{\otimes k}) &= \frac{1}{{\mathcal {T}}^k} \langle x^{\otimes k},\textrm{e}^{-\jmath \omega _c \left( m,\tau \right) } \rangle \\ &= \frac{1}{{\mathcal {T}}} \langle x,\textrm{e}^{-\jmath m_1 \omega _c \tau _1} \rangle \cdots \frac{1}{{\mathcal {T}}} \langle x,\textrm{e}^{-\jmath m_k \omega _c \tau _k} \rangle \\ &= c_{m_1}(x) \cdots c_{m_k}(x)\,. \end{aligned} \end{aligned}$$

With this expression and a straightforward generalization of Eqs. (4.21) and (4.24) to higher dimensional distributions, the Fourier coefficients of \(y_k\) are readily seen to be

$$\begin{aligned} c_m(y_k) = \hat{h}_k(m_1\omega _c,\ldots ,m_k\omega _c) \, c_{m_1}(x) \cdots c_{m_k}(x) \end{aligned}$$
(9.23)

with \(\hat{h}_k\) the Fourier transform of the kth order impulse response of the system.
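As a concrete illustration of (9.23), the following Python sketch computes the Fourier coefficients of the second order response \(y_2\) of a hypothetical system to a \({\mathcal {T}}\)-periodic input containing two harmonics. The transfer function h2_hat and all numerical values are made-up placeholders, not quantities derived in the text.

```python
import numpy as np
from itertools import product

T = 1e-3                     # assumed period
wc = 2 * np.pi / T           # fundamental angular frequency

def h2_hat(w1, w2, a=1e4):
    # Hypothetical symmetric second order transfer function (placeholder).
    return 1.0 / ((1j * w1 + a) * (1j * w2 + a) * (1j * (w1 + w2) + a))

# Fourier coefficients c_m(x) of a real input with two harmonics.
cx = {1: 0.5, -1: 0.5, 2: 0.25j, -2: -0.25j}

# (9.23): one coefficient of y2 per multi-index m = (m1, m2).
cy2 = {(m1, m2): h2_hat(m1 * wc, m2 * wc) * cx[m1] * cx[m2]
       for m1, m2 in product(cx, repeat=2)}

# Evaluated on the diagonal, all mixes with m1 + m2 = q add up to the
# tone of y2 at the angular frequency q*wc.
for q in range(0, 5):
    cq = sum(v for (m1, m2), v in cy2.items() if m1 + m2 == q)
    print(f"q = {q}: coefficient of exp(j q wc t) = {cq:.3e}")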

8 Multi-tone Input Signals

In some applications, for example in the study of interference and distortion in communication systems, one is often interested in the response of a system to input signals consisting of sinusoidal tones. If the frequencies of the tones are commensurate, that is, if their ratios are rational numbers, then one can find a common period and the input signal is periodic. The system response can thus be obtained by using the results of the previous section. However, for multi-tone input signals the results are often more directly interpretable by using a different indexing scheme for the tones composing the output components \(y_k\) [13].

8.1 General Case

Let’s consider a system driven by an input consisting of N complex tones

$$\begin{aligned} x(t) = \sum _{n=1}^N A_n \chi _n(t), \quad \chi _n(t) :=\textrm{e}^{\jmath \omega _n t},\quad A_n :=|A_n | \textrm{e}^{\jmath \varphi _n} \end{aligned}$$

initially assumed to have commensurate angular frequencies \(\omega _1,\ldots ,\omega _N\). Our objective is to calculate the system response of order k

$$\begin{aligned} y_k = h_k *x^{\otimes k}\,. \end{aligned}$$

Consider first the tensor power \(x^{\otimes k}\). It can be expanded with the help of (9.10)

$$\begin{aligned} \begin{aligned} x^{\otimes k} &= \left( \sum _{n=1}^N A_n \chi _n\right) ^{\otimes k}\\ &= \sum _{|m |=k} \frac{k!}{m!} A_1^{m_1} \cdots A_N^{m_N} \cdot \left[ \chi _1^{\otimes m_1} \otimes \cdots \otimes \chi _N^{\otimes m_N}\right] _{\text {sym}} \end{aligned} \end{aligned}$$

with m the multi-index \(m=(m_1,\ldots ,m_N)\) whose elements range from 0 to k. Observe that this expression is the Fourier series representation of \(x^{\otimes k}\). With it and (9.23) the Fourier series representation of \(y_k\) is thus found to be

$$\begin{aligned} y_k &= \sum _{|m |=k} \frac{k!}{m!} A_1^{m_1} \cdots A_N^{m_N} \, \hat{h}_{k,m} \cdot \left[ \chi _1^{\otimes m_1} \otimes \cdots \otimes \chi _N^{\otimes m_N}\right] _{\text {sym}} \\ \hat{h}_{k,m} & :=\hat{h}_k(\underbrace{\omega _1,\ldots ,\omega _1}_{m_1},\ldots , \underbrace{\omega _N,\ldots ,\omega _N}_{m_N}) \end{aligned}$$
(9.24)

with \(\hat{h}_k\) the Fourier transform of the impulse response of order k. As this sum is finite and composed only of indefinitely differentiable functions, it is itself an indefinitely differentiable function that can be evaluated on the diagonal

$$\begin{aligned} y_{k}(t) & :=\mathrm {ev_{d}}(y_k)(t) = \sum _{|m |=k} y_{k,m}(t) \end{aligned}$$
(9.25)
$$\begin{aligned} y_{k,m}(t) & :=\frac{k!}{m!} A_1^{m_1} \cdots A_N^{m_N} \, \hat{h}_{k,m} \, \textrm{e}^{\jmath \omega _m t}\end{aligned}$$
(9.26)
$$\begin{aligned} \omega _m & :=\sum _{n = 1}^N m_n \, \omega _n = m_1\omega _1 + \cdots + m_N\omega _N\,. \end{aligned}$$
(9.27)

The kth order response of the system is therefore a sum composed of

$$\begin{aligned} \frac{(N-1 + k)!}{(N-1)! k!} \end{aligned}$$
(9.28)

complex tones, each one uniquely determined by a specific multi-index m. In this context the multi-index m is also called a frequency mix and \(|m |\) its order.
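The bookkeeping behind (9.25)–(9.28) is easily mechanized. The Python sketch below enumerates all frequency mixes of order k for N complex tones, checks the count (9.28), and evaluates \(k!/m!\) and \(\omega _m\); the angular frequencies are assumed values used only for illustration.

```python
from itertools import combinations_with_replacement
from collections import Counter
from math import comb, factorial

def frequency_mixes(N, k):
    """All multi-indexes m = (m_1, ..., m_N) with |m| = k."""
    mixes = []
    for combo in combinations_with_replacement(range(N), k):
        counts = Counter(combo)
        mixes.append(tuple(counts.get(n, 0) for n in range(N)))
    return mixes

N, k = 2, 3
omega = [1.0, 1.1]                        # assumed angular frequencies
mixes = frequency_mixes(N, k)
assert len(mixes) == comb(N - 1 + k, k)   # the count (9.28)

for m in mixes:
    multinomial = factorial(k)
    for mn in m:
        multinomial //= factorial(mn)     # k!/m!
    w_m = sum(mn * w for mn, w in zip(m, omega))   # (9.27)
    print(f"m = {m}: k!/m! = {multinomial}, omega_m = {w_m:.1f}")
```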

These results show several important properties of weakly nonlinear systems.

  • In contrast to linear systems, weakly nonlinear systems generate tones at frequencies not present at their input.

  • In general, tones at a specific frequency are generated by frequency mixes of various orders.

  • To fully characterize \(\hat{h}_k\) (and hence \(h_k\)) one needs k input tones.

At the beginning of this section we assumed the input frequencies to be commensurate. If this is not the case then the input signal is not periodic, but almost periodic. For such signals one can still define a Fourier series [16, Sect. VI.9] and the results obtained above remain valid.

8.2 Real Case

In this section we specialize the above results to the case of a real system driven by an input consisting of N sinusoidal signals

$$\begin{aligned} x(t) = \sum _{n=1}^N |A_n | \cos (\omega _n t + \varphi _n) \end{aligned}$$

and where we assume \(\omega _1,\ldots ,\omega _N > 0\). To re-use previous results it’s convenient to represent the input signal in terms of complex exponentials and use separate indexes for positive and negative angular frequencies

$$\begin{aligned} & \qquad x(t) = \frac{1}{2} \sum _{n=1}^N \bigl [ A_n\chi _n(t) + A_{-n}\chi _{-n}(t) \bigr ], \qquad \chi _n(t) :=\textrm{e}^{\jmath \omega _n t}\\ & A_n :=|A_n | \textrm{e}^{\jmath \varphi _n}, \qquad A_{-n} :=\overline{A}_n = |A_n | \textrm{e}^{-\jmath \varphi _n},\qquad \omega _{-n} :=-\omega _n\,. \end{aligned}$$

The quantity \(A_n\) is called the phasor of the sinusoidal signal

$$\begin{aligned} |A_n |\cos (\omega _nt + \varphi _n)\,. \end{aligned}$$

With this notation and using the multi-index \(m=(m_{-N},\ldots ,m_{-1},m_1,\ldots ,m_N)\) the output component \(y_k\) is easily calculated with the help of (9.24)–(9.27)

$$\begin{aligned} y_{k}(t) & = \sum _{|m |=k} y_{k,m}(t) \\ y_{k,m}(t) & = \frac{1}{2^k} \frac{k!}{m!} A_{-N}^{m_{-N}} \cdots A_{-1}^{m_{-1}} A_1^{m_1} \cdots A_N^{m_N} \, \hat{h}_{k,m} \, \textrm{e}^{\jmath \omega _m t}\\ \hat{h}_{k,m} & = \hat{h}_k(\underbrace{\omega _{-N},\ldots ,\omega _{-N}}_{m_{-N}},\ldots , \underbrace{\omega _N,\ldots ,\omega _N}_{m_N})\\ \omega _m & = \sum _{\begin{array}{c} n = -N \\ n \ne 0 \end{array}}^N m_n \, \omega _n = (m_1 - m_{-1})\omega _1 + \cdots + (m_N - m_{-N})\omega _N\,. \end{aligned}$$

To N sinusoidal input tones there correspond 2N complex tones. Therefore, in the real case, the sum is composed of

$$\begin{aligned} \frac{(2N-1 + k)!}{(2N-1)! k!} \end{aligned}$$
(9.29)

frequency mixes.

In the real case there is some extra structure that we can exploit. Consider a specific frequency mix \(m=(m_{-N},\ldots ,m_{-1},m_1,\ldots ,m_N)\). From the above expression, it’s apparent that the multi-index 

$$\begin{aligned} \textrm{rv}(m) :=(m_N,\ldots ,m_1,m_{-1},\ldots ,m_{-N}) \end{aligned}$$
(9.30)

obtained from m by reversing the order of the entries also appears in the Fourier series of \(y_k\). If \(m \ne \textrm{rv}(m)\) then from \(k!/\textrm{rv}(m)! = k!/m!\), \(\omega _{\textrm{rv}(m)} = -\omega _m\), \(A_{\textrm{rv}(m)} = \overline{A}_m\) and \(\hat{h}_{k,\textrm{rv}(m)} = \overline{\hat{h}}_{k,m}\) we deduce that the sum of \(y_{k,m}(t)\) and \(y_{k,\textrm{rv}(m)}(t)\) is a sinusoidal signal

$$\begin{aligned} \begin{aligned} y_{k,m}^c(t) & :=y_{k,m}(t) + y_{k,\textrm{rv}(m)}(t) = 2\Re \{y_{k,m}\} \\ &= \frac{1}{2^{k-1}} \frac{k!}{m!} |A_1 |^{m_1 + m_{-1}} \cdots |A_N |^{m_N + m_{-N}} \, |\hat{h}_{k,m} | \,\\ &\qquad \cdot \cos (\omega _m t + \varphi _m + \psi _{k,m}) \end{aligned} \end{aligned}$$
(9.31)

with

$$\begin{aligned} \hat{h}_{k,m} &= |\hat{h}_{k,m} | \, \textrm{e}^{\jmath \psi _{k,m}} \end{aligned}$$
(9.32)
$$\begin{aligned} \varphi _m &= \sum _{\begin{array}{c} n = -N \\ n \ne 0 \end{array}}^N m_n \, \varphi _n = (m_1 - m_{-1})\varphi _1 + \cdots + (m_N - m_{-N})\varphi _N\,. \end{aligned}$$
(9.33)

If \(m = \textrm{rv}(m)\) then the multi-index \(\textrm{rv}(m)\) is not distinct from m and the Fourier series component described by \(\textrm{rv}(m)\) coincides with the one described by m. In this case \(\omega _m = 0\) and, as the system is assumed to be real, \(\hat{h}_{k,m}\) must be real. The response \(y_{k,m}\) therefore becomes

$$\begin{aligned} y_{k,m}(t) = \frac{1}{2^{k}} \frac{k!}{m!} |A_1 |^{2 m_1} \cdots |A_N |^{2 m_N} \, \hat{h}_{k,m}\,, \qquad m = \textrm{rv}(m)\,. \end{aligned}$$
(9.34)

Note that m and \(\textrm{rv}(m)\) can only be equal for even values of k. Also, note that there can be multi-indexes m resulting in \(\omega _m = 0\) for which \(m = \textrm{rv}(m)\) doesn’t hold.
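The real-case bookkeeping of (9.31)–(9.34) can be collected in a small helper. The Python sketch below returns, for one frequency mix, the mix frequency and the phasor Y of the output tone \(\Re \{Y\textrm{e}^{\jmath \omega _m t}\}\); the transfer function argument is a placeholder the caller must supply. Summing over one representative of each pair \(\{m, \textrm{rv}(m)\}\) reassembles \(y_k(t)\).

```python
import numpy as np
from math import factorial

def mix_contribution(m_neg, m_pos, A, omega, hk_hat):
    """Phasor of the output tone of one real-case frequency mix.

    m_neg = (m_{-1},...,m_{-N}), m_pos = (m_1,...,m_N): the frequency mix
    A, omega: phasors A_n and angular frequencies omega_n > 0, n = 1..N
    hk_hat:   callable standing in for the symmetric hat{h}_k (placeholder)
    Returns (omega_m, Y) with the tone equal to Re{Y exp(j omega_m t)}.
    """
    k = sum(m_neg) + sum(m_pos)
    coef = factorial(k)                       # build k!/m!
    for mn in (*m_neg, *m_pos):
        coef //= factorial(mn)
    # Arguments of hat{h}_k: omega_{-n} = -omega_n repeated m_{-n} times,
    # then omega_n repeated m_n times.
    args = [-w for w, r in zip(omega, m_neg) for _ in range(r)]
    args += [w for w, r in zip(omega, m_pos) for _ in range(r)]
    Y = coef / 2**k * hk_hat(*args)
    for An, rn, rp in zip(A, m_neg, m_pos):
        Y *= np.conj(An)**rn * An**rp
    if m_neg != m_pos:      # m != rv(m): include the reversed mix, cf. (9.31)
        Y *= 2.0
    w_m = sum(w * (rp - rn) for w, rn, rp in zip(omega, m_neg, m_pos))
    return w_m, Y

# Tiny demo with a hypothetical flat transfer function hat{h}_2 = 1:
print(mix_contribution((1,), (1,), A=[1.0], omega=[2.0], hk_hat=lambda *w: 1.0))
# -> (0.0, 0.5): the DC term (1/2)|A|^2, cf. (9.34)
```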

Example 9.7

Consider again the system described by the differential equation

$$\begin{aligned} Dy + a y = x + c y^2, \qquad a,c > 0 \end{aligned}$$

that we analysed in Example 9.5. Here we are interested in the steady state response of the system when driven by the input signal

$$\begin{aligned} x(t) = |A |\sin (\omega _1 t) = |A |\cos (\omega _1 t - \pi /2). \end{aligned}$$

In our previous analysis of this system we calculated the first three nonlinear transfer functions \(H_1, H_2\) and \(H_3\). Using those results, the output components \(y_1, y_2\) and \(y_3\) are immediately obtained from (9.31) and (9.34) without having to calculate any inverse Laplace transform.

Concretely, as the input signal consists of a single sinusoidal tone, the frequency mixes have two entries \(m=(m_{-1}, m_1)\). The output of first order \(y_1\) is obtained from the above equations by setting \(k=1\) and by summing over all multi-indexes satisfying the constraint \(|m| = m_{-1}+m_1 = 1\). There are only two such multi-indexes: (0, 1) and \(\textrm{rv}((0,1)) = (1,0)\). The first order output of the system is therefore given by

$$\begin{aligned} y_1(t) = \Re \left\{ H_1(\jmath \omega _1) A \textrm{e}^{\jmath \omega _1 t} \right\} \end{aligned}$$

with \(A = |A |\textrm{e}^{-\jmath \pi /2}\).

The second order response of the system \(y_2\) is obtained by setting \(k=2\) and summing over all multi-indexes under the constraint \(|m|=2\). There are three of them: (2, 0), (0, 2) and (1, 1). The first one is the reverse of the second one. Therefore, the contribution of these two is obtained from (9.31)

$$\begin{aligned} y_{2,(0,2)}^c(t) = \frac{1}{2} \Re \left\{ H_2(\jmath \omega _1, \jmath \omega _1) A^2 \textrm{e}^{\jmath 2\omega _1 t} \right\} \,. \end{aligned}$$

Since the remaining multi-index is equal to its reverse \((1,1) = \textrm{rv}((1,1))\), its contribution is the constant given by (9.34)

$$\begin{aligned} y_{2,(1,1)} = \frac{1}{2} |A |^2 H_2(-\jmath \omega _1,\jmath \omega _1)\,. \end{aligned}$$

The response of second order is thus

$$\begin{aligned} y_2(t) = y_{2,(0,2)}^c(t) + y_{2,(1,1)}. \end{aligned}$$

The third order response of the system \(y_3\) is obtained by setting \(k=3\) and summing over all multi-indexes for which \(|m|=3\). There are four of them: (3, 0), (2, 1), (1, 2) and (0, 3). Two of them are the reverse of the other two. For this reason the response of third order of the system \(y_3\) is given by

$$\begin{aligned} y_3(t) = y_{3,(0,3)}^c(t) + y_{3,(1,2)}^c(t) \end{aligned}$$

with

$$\begin{aligned} y_{3,(0,3)}^c(t) &= \frac{1}{4} \Re \left\{ H_3(\jmath \omega _1, \jmath \omega _1, \jmath \omega _1) A^3 \textrm{e}^{\jmath 3\omega _1 t} \right\} \\ y_{3,(1,2)}^c(t) &= \frac{3}{4} \Re \left\{ H_3(-\jmath \omega _1, \jmath \omega _1, \jmath \omega _1) |A |^2\,A \textrm{e}^{\jmath \omega _1 t} \right\} \,. \end{aligned}$$
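The predicted harmonic content can be cross-checked without any Volterra machinery by integrating the differential equation numerically and inspecting the steady state spectrum: for a weak nonlinearity, lines at \(0, \omega _1, 2\omega _1\) and \(3\omega _1\) should dominate. The parameter values in this sketch are assumptions chosen only to keep the system well inside the weakly nonlinear regime.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, c, w1, Aabs = 1.0, 0.05, 2 * np.pi, 1.0   # assumed parameters, c|A| small

def rhs(t, y):
    x = Aabs * np.sin(w1 * t)
    return [-a * y[0] + x + c * y[0]**2]

# Integrate past the transient, then sample an integer number of periods.
T = 2 * np.pi / w1
sol = solve_ivp(rhs, (0.0, 40 * T), [0.0], max_step=T / 200, dense_output=True)
t = np.linspace(20 * T, 40 * T, 4000, endpoint=False)
y = sol.sol(t)[0]

# Steady state spectrum: expect lines at 0, w1, 2*w1, 3*w1, ...
Y = np.fft.rfft(y) / len(y)
f = np.fft.rfftfreq(len(y), d=t[1] - t[0])
for q in range(4):
    idx = np.argmin(np.abs(2 * np.pi * f - q * w1))
    print(f"harmonic {q}: |Y| = {abs(Y[idx]):.2e}")
```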

Example 9.8: Two Tones Input

Suppose that we would like to implement a causal real LTI system. However, due to unavoidable limitations of physical components, the implementation behaves as a real weakly nonlinear system characterized by the nonlinear transfer functions \(H_k\) (see Fig. 9.4). We are interested in its output when driven by an input signal consisting of two sinusoidal tones

$$\begin{aligned} x(t) = |A_1 |\cos (\omega _1 t + \varphi _1) + |A_2 |\cos (\omega _2 t + \varphi _2)\,. \end{aligned}$$

We think of the two tones as closely spaced in frequency and denote the difference of their angular frequencies by \(\Delta \omega = \omega _2 - \omega _1\).

As the input is composed of two sinusoidal signals, the frequency mixes have four entries \(m = (m_{-2}, m_{-1}, m_1, m_2)\). From (9.29) we calculate that there are 4, 10 and 20 frequency mixes of order one, two and three, respectively. They are listed in Table 9.1.

Table 9.1 Frequency mixes generated by the first, second and third order nonlinearities of a weakly nonlinear system driven by two sinusoidal tones

The first order output \(y_1\) is the output that would be produced by a perfectly linear system. All other tones are undesired. In particular, while tones relatively distant in frequency from \(\omega _1\) and \(\omega _2\) are relatively easily suppressed with filters, tones close to them are much more difficult to filter out. The tones closest in frequency to \(\omega _1\) and \(\omega _2\) listed in Table 9.1 are the tones associated with the frequency mixes (1, 0, 2, 0), (0, 1, 0, 2) and their reverses

$$\begin{aligned} y_{3,(1,0,2,0)}^c(t) = \frac{3}{4} \Re \left\{ \overline{A_2} \, A_1^2 \, H_3(-\jmath \omega _2,\jmath \omega _1,\jmath \omega _1) \textrm{e}^{\jmath (\omega _1- \Delta \omega ) t} \right\} \end{aligned}$$

and

$$\begin{aligned} y_{3,(0,1,0,2)}^c(t) = \frac{3}{4} \Re \left\{ \overline{A_1} \, A_2^2 \, H_3(-\jmath \omega _1,\jmath \omega _2,\jmath \omega _2) \textrm{e}^{\jmath (\omega _2 + \Delta \omega ) t} \right\} \end{aligned}$$

both produced by nonlinearities of third order.

There are 56 frequency mixes of fifth order. Among them we can easily identify frequency mixes producing tones at every frequency generated by third order nonlinearities, in particular at \(\omega _1-\Delta \omega = 2\omega _1 - \omega _2\). To see this, start with a frequency mix m producing the frequency of interest and add the same number \(l > 0\) to \(m_n\) and \(m_{-n}\) for any n between 1 and N (the number of input sinusoidal tones, here 2)

$$\begin{aligned} m' = (m_{-N},\ldots ,m_{-n}+l,\ldots ,m_n+l,\ldots ,m_N)\,. \end{aligned}$$

Then the order of the new frequency mix \(m'\) is 2l higher than the one of m and the angular frequencies \(\omega _m\) and \(\omega _{m'}\) associated with the two frequency mixes are identical (see (9.27)).
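In code the construction is a two-line index manipulation; the following sketch reproduces the fifth order mixes used in the next paragraph.

```python
def raise_order(m, n, l, N):
    """Add l to m_{-n} and m_n in m = (m_{-N},...,m_{-1},m_1,...,m_N)."""
    m = list(m)
    m[N - n] += l        # position of m_{-n}
    m[N + n - 1] += l    # position of m_n
    return tuple(m)

# Starting from the third order mix (1, 0, 2, 0), which produces 2*w1 - w2:
print(raise_order((1, 0, 2, 0), n=2, l=1, N=2))   # (2, 0, 2, 1), fifth order
print(raise_order((1, 0, 2, 0), n=1, l=1, N=2))   # (1, 1, 3, 0), fifth order
```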

Using this construction starting from (1, 0, 2, 0), we see that the fifth order mixes (2, 0, 2, 1), (1, 1, 3, 0) and their reverses produce tones at \(\omega _1-\Delta \omega \)

$$\begin{aligned} y_{5,(2,0,2,1)}^c(t) &= \frac{15}{8} \Re \left\{ \overline{A_2}^2 \, A_1^2 \, A_2 H_5(-\jmath \omega _2,-\jmath \omega _2,\jmath \omega _1,\jmath \omega _1,\jmath \omega _2) \textrm{e}^{\jmath (\omega _1 - \Delta \omega ) t} \right\} \\ y_{5,(1,1,3,0)}^c(t) &= \frac{5}{4} \Re \left\{ \overline{A_2} \, \overline{A_1} \, A_1^3 \, H_5(-\jmath \omega _2,-\jmath \omega _1,\jmath \omega _1,\jmath \omega _1,\jmath \omega _1) \textrm{e}^{\jmath (\omega _1 - \Delta \omega ) t} \right\} \,. \end{aligned}$$

The total response of the system at the frequency \(\omega _1 - \Delta \omega \) is therefore a possibly infinite sum composed of the above mixes and higher order ones

$$\begin{aligned} y^c_{3,(1,0,2,0)} + y^c_{5,(2,0,2,1)} + y^c_{5,(1,1,3,0)} + \cdots \,. \end{aligned}$$

This sum can be represented graphically by drawing the phasor of each summand as a vector in the complex plane and summing them by vector addition. Figure 9.7 shows the phasor diagram for the above sum under the assumption that summands of order higher than fifth can be neglected.

Fig. 9.7 Hypothetical phasor diagram for the response at \(\omega _1 - \Delta \omega \) under the assumption that components of order higher than fifth are negligible

Observe that summands of different order depend differently on the amplitudes \(|A_1 |\) and \(|A_2 |\) of the input signals. For small input amplitudes the third order summand is usually dominant. As the amplitude of the input tones grows, higher order summands first become significant and then dominant. This means that both the magnitude and the phase of the output tone change with the amplitude of the input signals. At some level of the input tones there may even be a canceling effect where the output tone becomes very small.
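This behavior is easy to reproduce with a toy calculation. In the Python sketch below the transfer function values are hypothetical numbers, chosen only so that the third and fifth order phasors partially cancel at high drive; only their scaling with the input amplitude follows from the formulas above.

```python
import numpy as np

# Hypothetical values of H3 and H5 at the mix frequencies (assumptions).
H3 = 0.80 * np.exp(1j * 0.30)
H5a = 0.10 * np.exp(1j * np.pi * 0.90)    # mix (2, 0, 2, 1)
H5b = 0.05 * np.exp(1j * np.pi * 0.95)    # mix (1, 1, 3, 0)

for Aabs in (0.2, 0.5, 0.8, 1.1):
    A1 = A2 = Aabs                         # equal drive levels, zero phases
    p3 = 0.75 * A2 * A1**2 * H3                        # (3/4) A2 A1^2 H3
    p5 = (15 / 8) * A2**3 * A1**2 * H5a + (5 / 4) * A2 * A1**4 * H5b
    total = p3 + p5
    print(f"|A| = {Aabs:.1f}: magnitude = {abs(total):.3f}, "
          f"phase = {np.angle(total):+.2f} rad")
```

The printed magnitude grows like \(|A |^3\) at small drive, while the phase drifts as the fifth order summands take over.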

Fig. 9.8 Positive part of a typical magnitude output spectrum of a weakly nonlinear system driven by two sinusoidal input tones. The number q above each spectral line indicates the lowest order nonlinearity generating the line. The same line is also generated by every nonlinearity of order \(q + 2l, l \in {\mathbb {N}}\). Only lines generated by fifth or lower order nonlinearities are shown

Among the 56 frequency mixes of fifth order, several generate tones at new frequencies. In particular, the closest in frequency to \(\omega _1\) and \(\omega _2\) (not generated by lower order mixes) are at \(\omega _1 - 2\Delta \omega \) and \(\omega _2 + 2\Delta \omega \). Similarly, frequency mixes of higher odd order introduce tones at new frequencies spaced by \(\Delta \omega \) from the previous ones. Figure 9.8 illustrates a typical spectrum of the output signal. For simplicity of representation the figure only shows lines generated by fifth or lower order nonlinearities.