10.1 Cascade of Noninteracting Systems

When building large systems, it’s common to construct them by combining smaller subsystems. To be able to investigate such systems, in this section we study the fundamental operation of cascading two systems, that is, of connecting the output of one system to the input of another. In our treatment we assume that this connection doesn’t change the behavior of the systems involved. This is not always the case, so before applying what follows, this assumption must be verified carefully.

Consider the cascade of the weakly nonlinear systems \({\mathcal {G}}\) and \({\mathcal {H}}\) as shown in Fig. 10.1. Both systems are characterised by their respective impulse responses \(g_k\) and \(h_k\) that we assume to belong to a convolution algebra \({\mathcal {A}}'_{\oplus ,\text {sym}}\). We are looking for an expression to represent

$$\begin{aligned} y = (h \circ g) [x] :=h[g[x]]\,, \end{aligned}$$

the composition of \({\mathcal {H}}\) after \({\mathcal {G}}\), which we denote by \({\mathcal {H}} \circ {\mathcal {G}}\).

Let’s first consider the system \({\mathcal {G}}\). Its output z, when driven by the one dimensional distribution \(x_1\), is

$$\begin{aligned} z = \sum _{k=1}^\infty g_k *x_1^{\otimes k}\,. \end{aligned}$$

If instead of representing the input signal by a one dimensional distribution \(x_1\), we represent it by a sequence \(x = (0, x_1, 0, \ldots ) \in {\mathcal {A}}'_{\oplus ,\text {sym}}\) with all its components but \(x_1\) equal to zero and use the product that we defined on \({\mathcal {D}}'_{\oplus ,\text {sym}}\), then we can express z in the equivalent form

$$\begin{aligned} z = \sum _{k=1}^\infty g_k *x^k\,. \end{aligned}$$

The obtained expression is even more reminiscent of a power series than the original one and, more importantly, it is more amenable to generalisation. In fact, if we assume that this expression remains valid for arbitrary input signals belonging to \({\mathcal {A}}'_{\oplus ,\text {sym}}\), then the same expression can be used to describe the output of \({\mathcal {H}}\) in terms of z

$$\begin{aligned} y = \sum _{k=1}^\infty h_k *z^k\,. \end{aligned}$$

We can then define the composition of weakly nonlinear systems by

$$\begin{aligned} \begin{aligned} (h \circ g)[x_1] &:=\sum _{k=1}^\infty (h \circ g)_k *x_1^{\otimes k} :=\sum _{k=1}^\infty h_k *z^k\\ &= h_1 *(g_1 *x_1 + g_2 *x_1^{\otimes 2} + \cdots )\\ &\quad + h_2 *(g_1 *x_1 + g_2 *x_1^{\otimes 2} + \cdots )^2\\ &\quad + h_3 *(g_1 *x_1 + g_2 *x_1^{\otimes 2} + \cdots )^3\\ &\quad + \cdots \end{aligned} \end{aligned}$$
(10.1)

with \((h \circ g)_k\) denoting the kth order impulse response of the overall system, consisting of all terms of dimension k. Note that, for every value of k, there is only a finite number of them, since the lowest tensor power of \(x_1\) appearing in \(z^n\) is the nth one and thus

$$\begin{aligned} (z^n)_k = 0 \qquad \text {for} \quad n > k\,. \end{aligned}$$

The first five components are listed in Table 10.1 for easy reference. Note, here as well, the analogy with power series and their composition [25].
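The analogy can be checked directly in the scalar (memoryless) case, where all convolutions collapse to products. The following sketch uses SymPy to compose two truncated formal power series and recover the pattern of the composite coefficients:

```python
import sympy as sp

x = sp.symbols('x')
g1, g2, g3 = sp.symbols('g1 g2 g3')
h1, h2, h3 = sp.symbols('h1 h2 h3')

# Scalar analogue of the cascade: z = g(x), then y = h(z), as truncated power series.
z = g1*x + g2*x**2 + g3*x**3
y = sp.expand(h1*z + h2*z**2 + h3*z**3)

# The coefficient of x^k collects the order-k part of the composition.
for k in (1, 2, 3):
    print(k, y.coeff(x, k))
```

The printed coefficients, \(h_1g_1\), \(h_1g_2 + h_2g_1^2\) and \(h_1g_3 + 2h_2g_1g_2 + h_3g_1^3\), mirror the entries of Table 10.1 with tensor products and convolutions replaced by ordinary products.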

Table 10.1 Lowest order impulse responses of the composite system \(h \circ g\) in terms of the impulse responses of h and g
Fig. 10.1 Cascade of the system \({\mathcal {G}}\) with \({\mathcal {H}}\)

The above definition by itself is not complete, as the convolution between distributions of different dimensions is not defined. To complete the definition we thus have to give a meaning to all undefined convolutions appearing in the expression for \((h \circ g)\).

Table 10.2 Convolutions between impulse responses of different order appearing in the composition of weakly nonlinear systems and their definition. They are grouped by the resulting order, from second to fifth. To simplify the notation the symmetrization operation is not explicitly shown

Let’s consider the convolutions appearing in \((h \circ g)_k\). The undefined ones are those involving \(h_l\) with \(l < k\). The first thing to note is that, for every l, all of them are convolution products between \(h_l\) and a distribution that is the tensor product of l distributions. In addition, by definition, the sum of the dimensions of these l distributions must be k. The convolution products that we have to define therefore all have the form

$$\begin{aligned} h_l *\left[ g_1^{\otimes \alpha _1} \otimes \cdots \otimes g_{k-l+1}^{\otimes \alpha _{k-l+1}}\right] _{\text {sym}}\,, \qquad \alpha _i \in {\mathbb {N}}\end{aligned}$$
(10.2)

with

$$\begin{aligned} \sum _{i=1}^{k-l+1} i\,\alpha _i = k, \qquad \sum _{i=1}^{k-l+1} \alpha _i = l. \end{aligned}$$
(10.3)
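For given k and l there are only finitely many admissible exponent tuples \((\alpha_1,\ldots,\alpha_{k-l+1})\). A short sketch that enumerates the solutions of (10.3) by brute force (the helper name `alphas` is ours):

```python
from itertools import product

def alphas(k, l):
    """Exponent tuples (alpha_1, ..., alpha_{k-l+1}) satisfying (10.3)."""
    n = k - l + 1
    return [a for a in product(range(l + 1), repeat=n)
            if sum(a) == l and sum((i + 1)*ai for i, ai in enumerate(a)) == k]

# Order-5 contributions involving h_3: the tensor products
# g1 ⊗ g2 ⊗ g2 and g1 ⊗ g1 ⊗ g3.
print(alphas(5, 3))  # [(1, 2, 0), (2, 0, 1)]
```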

The simplest case is the one for \(l=1\)

$$\begin{aligned} h_1 *g_k \end{aligned}$$

which represents a nonlinearity of order k, \(g_k\), followed by a linear system \(h_1\). To find a suitable general definition for this convolution, we let ourselves be guided by regular distributions belonging to \(L_1\), evaluated on the diagonal. Setting \(k=2\) for simplicity we have

$$\begin{aligned} \begin{aligned} \mathrm {ev_{d}}(z_2)(t) &= \mathrm {ev_{d}}(g_2 *x_1^{\otimes 2})(t) \\ &= \int \limits _0^\infty \int \limits _0^\infty g_2(\lambda _1,\lambda _2) x_1(t - \lambda _1) x_1(t - \lambda _2) d\lambda _1 d\lambda _2\,. \end{aligned} \end{aligned}$$

Using this expression as the input of \(h_1\) we obtain

$$\begin{aligned} \mathrm {ev_{d}}(y_2)(t) &= \int \limits _0^\infty h_1(\tau _1) \mathrm {ev_{d}}(z_2)(t - \tau _1) \,d\tau _1\\ &= \int \limits _0^\infty h_1(\tau _1) \int \limits _0^\infty \! \int \limits _0^\infty g_2(\lambda _1,\lambda _2)\\ &\qquad \cdot x_1(t - \lambda _1 - \tau _1) x_1(t - \lambda _2 - \tau _1) \,d\lambda _1 d\lambda _2 d\tau _1\\ &= \int \limits _0^\infty \! \int \limits _0^\infty \! \int \limits _0^\infty h_1(\tau _1) g_2(\lambda _1 - \tau _1,\lambda _2 - \tau _1) \,d\tau _1\\ &\qquad \cdot x_1(t - \lambda _1) x_1(t - \lambda _2) \,d\lambda _1 d\lambda _2\,. \end{aligned}$$

Note that the innermost integral in the last expression is a convolution integral between \(h_1\) and \(g_2\). It can be generalised to arbitrary distributions by building the tensor product of \(h_1(\tau _1)\) with \(\delta (\tau _2 - \tau _1)\), the Dirac delta distribution in \(\tau _2\) parameterised (shifted) by \(\tau _1\)

$$\begin{aligned} & \left\langle \left[ h_1(\tau _1) \otimes \delta (\tau _2 - \tau _1) \right] *g_2(\tau _1, \tau _2),\phi (\tau _1, \tau _2) \right\rangle \\ &\qquad = \left\langle h_1(\tau _1) \otimes \delta (\tau _2 - \tau _1) \otimes g_2(\lambda _1, \lambda _2),\phi (\tau _1+\lambda _1, \tau _2+\lambda _2) \right\rangle \\ &\qquad = \left\langle h_1(\tau _1) \otimes g_2(\lambda _1, \lambda _2),\left\langle \delta (\tau _2 - \tau _1),\phi (\tau _1+\lambda _1, \tau _2+\lambda _2) \right\rangle \right\rangle \\ &\qquad = \left\langle h_1(\tau _1) \otimes g_2(\lambda _1, \lambda _2),\phi (\tau _1+\lambda _1, \tau _1+\lambda _2) \right\rangle \,. \end{aligned}$$

The above derivation generalises without any difficulty to the convolution between \(h_1\) and the impulse response of order k of \({\mathcal {G}}\). Taking into account that impulse responses have to be symmetric, we thus define the convolution between \(h_1\) and \(g_k\) by

$$\begin{aligned} (h_1 *g_k)(\tau _1,\ldots , \tau _k) :=\left[ h_1(\tau _1) \otimes \delta (\tau _2 - \tau _1,\ldots ,\tau _k - \tau _1)\right] _{\text {sym}} *g_k(\tau _1,\ldots , \tau _k)\,. \end{aligned}$$

In other words, to convolve \(h_1\) with a distribution of dimension k we promote \(h_1\) to a distribution of k dimensions by building the indicated tensor product and use the standard definition of convolution.

The Laplace transform of \(h_1 *g_k\) has a very simple representation and leads to an easy interpretation. With

$$\begin{aligned} & \left\langle h_1(\tau _1) \otimes \delta (\tau _2 - \tau _1, \ldots , \tau _k - \tau _1),\mathrm{{e}}^{-s_1\tau _1 - \cdots - s_k\tau _k} \right\rangle \\ &\qquad \,\, = \left\langle h_1(\tau _1),\left\langle \delta (\tau _2 - \tau _1, \ldots , \tau _k - \tau _1),\mathrm{{e}}^{-s_1\tau _1 - \cdots - s_k\tau _k} \right\rangle \right\rangle \\ &\qquad \,\, = \left\langle h_1(\tau _1),\mathrm{{e}}^{-(s_1 + \cdots + s_k)\tau _1} \right\rangle \end{aligned}$$

we find

$$\begin{aligned} {\mathcal {L}}\{h_1 *g_k\}(s_1, \ldots , s_k) = H_1(s_1 + \cdots + s_k) \, G_k(s_1, \ldots , s_k)\,. \end{aligned}$$

Therefore, if the input signal \(x_1\) consists of N tones, the nonlinear system component \(g_k\) generates new tones at frequencies that are linear combinations of k of the input frequencies at a time (see (9.25)). The linear system \(h_1\) following it simply filters these newly generated tones as prescribed by its transfer function \(H_1\), in accordance with expectation.
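This filtering interpretation can be checked symbolically. The sketch below drives a memoryless squarer (so that \(G_2 = 1\)) with two unit tones and filters the result with the one-pole response \(h_1(\tau) = \mathrm{e}^{-\tau}\), \(H_1(s) = 1/(1+s)\); these concrete kernels are our own illustrative choices:

```python
import sympy as sp

t, tau = sp.symbols('t tau', real=True)
I = sp.I

# Two unit tones at omega = 1 and omega = 2 through a memoryless squarer (G2 = 1).
x = sp.exp(I*t) + sp.exp(2*I*t)
z = sp.expand(x**2)  # tones at omega = 2, 3 (twice) and 4

# Filter z with h1(tau) = exp(-tau), i.e. H1(s) = 1/(1 + s).
y = sp.integrate(sp.exp(-tau) * z.subs(t, t - tau), (tau, 0, sp.oo))

# Prediction from H1(s1 + s2) G2(s1, s2): each generated tone at
# omega_a + omega_b is simply scaled by H1(j(omega_a + omega_b)).
H1 = lambda s: 1/(1 + s)
pred = H1(2*I)*sp.exp(2*I*t) + 2*H1(3*I)*sp.exp(3*I*t) + H1(4*I)*sp.exp(4*I*t)
print(sp.simplify(y - pred))
```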

Next, consider the simplest remaining undefined convolution

$$\begin{aligned} h_2 *\left[ g_1 \otimes g_2\right] _{\text {sym}}\,. \end{aligned}$$

As for the previous case, we look for a way to promote \(h_2\) to a distribution of dimension \(k=3\) so that we can use the standard definition of convolution. We do so by working with multi-tone input signals as this leads to easier interpretations.

Let \(g_1 \otimes g_2\) be driven by 3 unit tones

$$\begin{aligned} x_1(t) = \mathrm{{e}}^{\jmath \omega _1 t} + \mathrm{{e}}^{\jmath \omega _2 t} + \mathrm{{e}}^{\jmath \omega _3 t}\,, \end{aligned}$$

then its output is

$$\begin{aligned} & \left[ g_1\otimes g_2\right] _{\text {sym}} *x_1^{\otimes 3} \\ &\qquad \qquad \quad = \sum _{n_1=1}^3 \sum _{n_2=1}^3 \sum _{n_3=1}^3 G_1(\jmath \omega _{n_1}) G_2(\jmath \omega _{n_2}, \jmath \omega _{n_3}) \mathrm{{e}}^{\jmath (\omega _{n_1}\tau _1 + \omega _{n_2}\tau _2 + \omega _{n_3}\tau _3)} \\ & \qquad \qquad \quad = \sum _{n_1=1}^3 \sum _{n_2=1}^3 \sum _{n_3=1}^3 G_1(\jmath \omega _{n_1})\mathrm{{e}}^{\jmath \omega _{n_1}\tau _1} \, G_2(\jmath \omega _{n_2}, \jmath \omega _{n_3}) \mathrm{{e}}^{\jmath (\omega _{n_2}\tau _2 + \omega _{n_3}\tau _3)} \end{aligned}$$

with \(G_1(s_1)G_2(s_2,s_3)\) the Laplace transform of \(g_1 \otimes g_2\). This expression suggests that \(g_1 \otimes g_2\) can be interpreted as the parallel combination of a linear system and a second order one. For each term of the sum, the tone at \(\omega _{n_1}\) passes through the linear system \(g_1\) while the other two pass through \(g_2\). The output \(\mathrm {ev_{d}}(g_1 \otimes g_2)(t)\) can thus be considered as consisting of a sum of tone pairs, one tone at \(\omega _{n_1}\) and the other at \(\omega _{n_2} + \omega _{n_3}\). This sum of tone pairs constitutes the input of \(h_2\), which processes them and, for each tone pair, generates the signal

$$\begin{aligned} H_2(\jmath \omega _{n_1}, \jmath \omega _{n_2} + \jmath \omega _{n_3}) G_1(\jmath \omega _{n_1}) G_2(\jmath \omega _{n_2}, \jmath \omega _{n_3}) \mathrm{{e}}^{\jmath (\omega _{n_1} + \omega _{n_2} + \omega _{n_3}) t}\,. \end{aligned}$$

Given these considerations, we define the convolution between \(h_2\) and \(\left[ g_1 \otimes g_2\right] _{\text {sym}}\) by

$$\begin{aligned} h_2 *\left[ g_1 \otimes g_2\right] _{\text {sym}} :=\left[ (h_2(\tau _1, \tau _2) \otimes \delta (\tau _3 - \tau _2)) *(g_1(\tau _1) \otimes g_2(\tau _2, \tau _3))\right] _{\text {sym}}\,. \end{aligned}$$

Its Laplace transform is

$$\begin{aligned} {\mathcal {L}}\{h_2 *\left[ g_1 \otimes g_2\right] _{\text {sym}}\}(s_1, s_2, s_3) = \left[ H_2(s_1, s_2 + s_3) G_1(s_1) G_2(s_2, s_3)\right] _{\text {sym}}\,. \end{aligned}$$

The above considerations can be extended to the general case (10.2). The tensor product of l distributions

$$\begin{aligned} g_1^{\otimes \alpha _1} \otimes \cdots \otimes g_{k-l+1}^{\otimes \alpha _{k-l+1}}\,, \qquad \sum _{i=1}^{k-l+1} \alpha _i = l \end{aligned}$$

can be thought of as a set of l parallel subsystems of order lower than k. The constraints (10.3) ensure that, with k input tones, its output can be made to consist of l tones at linear combinations of the original input frequencies. These can then be passed as input to \(h_l\).

The intended meaning of the generalised convolution expressed by (10.2) can thus be captured by promoting \(h_l\) to a k dimensional distribution obtained by building the tensor product of \(h_l\) and \(k-l\) appropriately shifted \(\delta \) distributions constructed as follows.

  • The first independent variable of each of the l distributions

    $$\begin{aligned} g_{m_1}(\tau _1,\dotsc ,\tau _{m_1}) \otimes \cdots \otimes g_{m_j}(\tau _{n+1},\dotsc ,\tau _{n+m_j}) \otimes \cdots \otimes g_{m_l}(\tau _{k-m_l+1},\dotsc ,\tau _k) \end{aligned}$$

    form the list of independent variables of \(h_l\)

    $$\begin{aligned} h_l(\tau _1,\dotsc ,\tau _{n+1},\dotsc ,\tau _{k-m_l+1})\,. \end{aligned}$$
  • For each additional variable of \(g_{m_j}\), \(m_j>1\), we tensor-multiply \(h_l\) by a Dirac distribution in this same variable, shifted by the first one

    $$\begin{aligned} \delta (\tau _{n+2}-\tau _{n+1}) \otimes \cdots \otimes \delta (\tau _{n+m_j}-\tau _{n+1})\,. \end{aligned}$$
  • The resulting k dimensional distribution has finally to be symmetrized.

A few convolution examples are given in Table 10.2. The Laplace transforms of these examples are tabulated in Table 10.3. With this definition we have completed the description of how to compose weakly nonlinear systems.

Table 10.3 Convolutions between impulse responses of different order appearing in the composition of weakly nonlinear systems and their Laplace transforms. They are grouped by the resulting order, from second to fifth. To simplify the notation the symmetrization operation is not explicitly shown

Example 10.1: Third-Order Nonlinearity

The third order nonlinearity of \({\mathcal {H}} \circ {\mathcal {G}}\) is generated in three distinct ways: First, by the nonlinearity of third order of \({\mathcal {H}}\) applied to the output of the linear part of \({\mathcal {G}}\)

$$\begin{aligned} H_3(s_1,s_2,s_3) G_1(s_1) G_1(s_2) G_1(s_3) X_1(s_1) X_1(s_2) X_1(s_3)\,, \end{aligned}$$

second, by the nonlinearity of third order of \({\mathcal {G}}\) passing through the linear part of \({\mathcal {H}}\)

$$\begin{aligned} H_1(s_1 + s_2 + s_3) G_3(s_1, s_2, s_3) X_1(s_1) X_1(s_2) X_1(s_3) \end{aligned}$$

and third, by the second order nonlinearity of \({\mathcal {H}}\) applied to the output of first and second order of \({\mathcal {G}}\)

$$\begin{aligned} 2 \, \left[ H_2(s_1, s_2 + s_3) G_1(s_1)G_2(s_2,s_3)\right] _{\text {sym}} X_1(s_1) X_1(s_2) X_1(s_3)\,. \end{aligned}$$
Fig. 10.2 Graphical representation of the third order nonlinearity generated by the composition \({\mathcal {H}} \circ {\mathcal {G}}\). Each path has to be understood as symmetrised

These mechanisms are represented graphically in Fig. 10.2. In particular one should note that, even if neither \({\mathcal {G}}\) nor \({\mathcal {H}}\) shows nonlinearities of third order, the combined system \({\mathcal {H}} \circ {\mathcal {G}}\) in general still has an impulse response of third order different from zero.
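The three contributions can be assembled and checked with a computer algebra system. In the sketch below the kernels are illustrative one-pole choices of ours (any symmetric kernels would do); the check is that the symmetrised third order transfer function is invariant under permutations of its arguments:

```python
import itertools
import sympy as sp

s1, s2, s3 = sp.symbols('s1 s2 s3')

def sym(expr, v=(s1, s2, s3)):
    """Symmetrize expr over all permutations of the variables in v."""
    perms = list(itertools.permutations(v))
    return sum(expr.subs(dict(zip(v, p)), simultaneous=True) for p in perms) / len(perms)

# Illustrative one-pole kernels (our choices, not from the text).
H1 = lambda s: 1/(1 + s)
H2 = lambda a, b: 1/((1 + a)*(1 + b))
H3 = lambda a, b, c: 1/((1 + a)*(1 + b)*(1 + c))
G1 = lambda s: 2/(3 + s)
G2 = lambda a, b: 1/((2 + a)*(2 + b))
G3 = lambda a, b, c: 1/((3 + a)*(3 + b)*(3 + c))

# The three mechanisms generating the third order transfer function of H∘G.
HG3 = sym(H3(s1, s2, s3)*G1(s1)*G1(s2)*G1(s3)
          + H1(s1 + s2 + s3)*G3(s1, s2, s3)
          + 2*H2(s1, s2 + s3)*G1(s1)*G2(s2, s3))

# A symmetrised kernel is invariant under any permutation of its arguments.
swapped = HG3.subs({s1: s3, s3: s1}, simultaneous=True)
print(sp.cancel(HG3 - swapped))
```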

Example 10.2: Memory-Less Systems

Consider the convolution (10.2) with \(h_l\) a Dirac distribution of dimension \(l < k\)

$$\begin{aligned} \delta ^{\otimes l} *\left[ g_1^{\otimes \alpha _1} \otimes \cdots \otimes g_{k-l+1}^{\otimes \alpha _{k-l+1}}\right] _{\text {sym}}\,. \end{aligned}$$

By definition, the lower dimensional distribution \(\delta ^{\otimes l}\) is promoted to a distribution of dimension k by building the tensor product with shifted Dirac distributions as explained. For simplicity, we denote the promoted distribution by \(h_k\). Application of the convolution to a test function \(\phi \in {\mathcal {D}}({\mathbb {R}}^k)\) is defined by

$$\begin{aligned} & \left\langle h_k(\tau ) \otimes \left[ g_1^{\otimes \alpha _1} \otimes \cdots \otimes g_{k-l+1}^{\otimes \alpha _{k-l+1}}(\lambda )\right] _{\text {sym}},\phi (\tau + \lambda ) \right\rangle \\ &\qquad \qquad \qquad \qquad \qquad \qquad = \left\langle \left[ g_1^{\otimes \alpha _1} \otimes \cdots \otimes g_{k-l+1}^{\otimes \alpha _{k-l+1}}(\lambda )\right] _{\text {sym}},\left\langle h_k(\tau ),\phi (\tau + \lambda ) \right\rangle \right\rangle \end{aligned}$$

with \(\tau , \lambda \in {\mathbb {R}}^k\). The inner distribution is easily evaluated

$$\begin{aligned} \left\langle h_k(\tau ),\phi (\tau + \lambda ) \right\rangle = \phi (\lambda ) \end{aligned}$$

and from this we conclude that for any \(l \le k\)

$$\begin{aligned} \delta ^{\otimes l} *\left[ g_1^{\otimes \alpha _1} \otimes \cdots \otimes g_{k-l+1}^{\otimes \alpha _{k-l+1}}\right] _{\text {sym}} = \left[ g_1^{\otimes \alpha _1} \otimes \cdots \otimes g_{k-l+1}^{\otimes \alpha _{k-l+1}}\right] _{\text {sym}}\,. \end{aligned}$$

With this result we see that the response of a memoryless weakly nonlinear system can be written in the following equivalent forms

$$\begin{aligned} y = \sum _{k=1}^\infty c_k x^k = \sum _{k=1}^\infty c_k\delta ^{\otimes l_k} *x^k\,, \qquad l_k \le k\,. \end{aligned}$$
(10.4)

In general we will use \(l_k = k\) so that, if the input signal is a one dimensional distribution \(x_1\), we do not need to use the extended definition of convolution.

Our definition of convolution between distributions of different dimensions and our definition of the one dimensional differential operator operating on higher dimensional distributions (9.14) are compatible. In fact the former is a generalization of the latter. Consider the differential operator acting on the k dimensional Dirac distribution \(\delta ^{\otimes k}\). Application to a test function \(\phi \in {\mathcal {D}}({\mathbb {R}}^k)\) results in

$$\begin{aligned} \left\langle D\delta ^{\otimes k},\phi \right\rangle &= \left\langle \sum _{j=1}^kD_j\delta ^{\otimes k},\phi \right\rangle = -\left\langle \delta ^{\otimes k},\sum _{j=1}^kD_j\phi \right\rangle \\ &= -\sum _{j=1}^kD_j\phi (0,\dotsc ,0)\,. \end{aligned}$$

If we now consider \(\phi \) as a function of the variable \(\tau _1\) only and \(D_{\tau _1}\) the total differential operator, then we can write

$$\begin{aligned} -\sum _{j=1}^kD_j\phi (0,\dotsc ,0) &= -\left\langle \delta (\tau _1),D_{\tau _1}\phi (\tau _1,\dotsc ,\tau _1) \right\rangle \\ &= \left\langle D_{\tau _1}\delta (\tau _1),\phi (\tau _1,\dotsc ,\tau _1) \right\rangle \\ &= \left\langle (D\delta ) *\delta ^{\otimes k},\phi \right\rangle \end{aligned}$$

which shows that our definition of the differential operator acting on a higher dimensional distribution is equal to the convolution with the one dimensional distribution \(D\delta \), promoted to a k dimensional distribution by our definition of convolution

$$\begin{aligned} D\delta ^{\otimes k} = (D\delta ) *\delta ^{\otimes k}\,. \end{aligned}$$
(10.5)

This is also apparent from the Laplace transforms, which in both cases are equal to

$$\begin{aligned} s_1 + \cdots + s_k\,. \end{aligned}$$

The differential operator and the extended definition of convolution do satisfy (3.15). We show this by way of an example. Consider the convolution between \(h_2\) and \(g_1 \otimes g_2\). Suppose further that \(h_2\) is the derivative in the sense of (9.14) of another distribution \(w_2\)

$$\begin{aligned} h_2 = Dw_2\,. \end{aligned}$$

Applying the convolution product to a test function \(\phi \in {\mathcal {D}}({\mathbb {R}}^3)\) and using the extended definition of convolution we obtain

$$\begin{aligned} & \left\langle h_2 *(g_1 \otimes g_2),\phi \right\rangle = \left\langle Dw_2 *(g_1 \otimes g_2),\phi \right\rangle \\ &\quad \quad \,\, = \left\langle \{[Dw_2(\tau _1, \tau _2)] \otimes \delta (\tau _3 - \tau _2)\} \otimes [g_1(\lambda _1) \otimes g_2(\lambda _2, \lambda _3)] ,\right. \\ &\quad \qquad \qquad \left. \phi (\tau _1+\lambda _1, \tau _2+\lambda _2, \tau _3+\lambda _3) \right\rangle \\ &\quad \quad \,\, = \left\langle [Dw_2(\tau _1, \tau _2)] \otimes [g_1(\lambda _1) \otimes g_2(\lambda _2, \lambda _3)] ,\right. \\ &\quad \qquad \qquad \left. \phi (\tau _1+\lambda _1, \tau _2+\lambda _2, \tau _2+\lambda _3) \right\rangle \,. \end{aligned}$$

Further, using the definition of differentiation and noting that

$$\begin{aligned} D_{\tau _2} \phi (\tau _1+\lambda _1, \tau _2+\lambda _2, \tau _2+\lambda _3) = (D_{\lambda _2} + D_{\lambda _3}) \phi (\tau _1+\lambda _1, \tau _2+\lambda _2, \tau _2+\lambda _3) \end{aligned}$$

we obtain

$$\begin{aligned} & -\left\langle [w_2(\tau _1, \tau _2)] \otimes [g_1(\lambda _1) \otimes g_2(\lambda _2, \lambda _3)] ,\right. \\ &\quad \qquad \qquad \left. (D_{\tau _1} + D_{\tau _2}) \phi (\tau _1+\lambda _1, \tau _2+\lambda _2, \tau _2+\lambda _3) \right\rangle \\ &\quad \quad \,\, = -\left\langle [w_2(\tau _1, \tau _2)] \otimes [g_1(\lambda _1) \otimes g_2(\lambda _2, \lambda _3)] ,\right. \\ &\quad \qquad \qquad \left. (D_{\lambda _1} + D_{\lambda _2} + D_{\lambda _3}) \phi (\tau _1+\lambda _1, \tau _2+\lambda _2, \tau _2+\lambda _3) \right\rangle \\ &\quad \quad \,\, = \left\langle w_2(\tau _1, \tau _2) \otimes D[g_1(\lambda _1) \otimes g_2(\lambda _2, \lambda _3)] ,\right. \\ &\quad \qquad \qquad \left. \phi (\tau _1+\lambda _1, \tau _2+\lambda _2, \tau _2+\lambda _3) \right\rangle \end{aligned}$$

or, summarising

$$\begin{aligned} (Dw_2) *(g_1 \otimes g_2) = w_2 *D(g_1 \otimes g_2)\,. \end{aligned}$$
(10.6)

10.2 Feedback

A powerful technique used in the design of all sorts of systems is feedback. In control systems design, this technique is used to stabilise and adjust the dynamics of a system to achieve a desired behaviour. It’s also used to reduce the sensitivity of systems to poorly controlled parameters. Here we are interested in describing the nonlinearities of a system making use of feedback in terms of those of its constituent subsystems.

Consider the system shown in Fig. 10.3, composed of a forward subsystem \({\mathcal {G}}\) and a feedback subsystem \({\mathcal {H}}\). The input of \({\mathcal {G}}\) is the difference between the input signal x and the signal z, which is obtained by sensing the output y and processing it with \({\mathcal {H}}\). The system is described by the following equations

$$\begin{aligned} e &= x - z\\ z &= (h \circ g)[e]\\ y &= g[e]\,. \end{aligned}$$

Our objective is to obtain the impulse responses of the system from those of \({\mathcal {G}}\) and \({\mathcal {H}}\). We denote the overall system by \({\mathcal {W}}\) and its impulse response of order k by \(w_k\).

Fig. 10.3 Weakly nonlinear system with feedback

We start by computing the linear impulse response. The composition of linear systems is obtained by convolving their first order impulse responses. We can therefore write the equation

$$\begin{aligned} e_1 = \delta - z_1 = \delta - h_1 *g_1 *e_1 \end{aligned}$$

and, solving for \(e_1\), we obtain

$$\begin{aligned} e_1 = (\delta + h_1 *g_1)^{*-1}\,. \end{aligned}$$

With \(e_1\) the calculation of the linear impulse response is immediate

$$\begin{aligned} w_1 = g_1 *e_1 = (\delta + h_1 *g_1)^{*-1} *g_1\,. \end{aligned}$$

Its Laplace transform is a classical result of linear system theory

$$\begin{aligned} W_1(s) = \frac{G_1(s)}{1 + H_1(s) G_1(s)}\,. \end{aligned}$$

If, in the frequency range of interest, the magnitude of the linear loop gain is large, \(|H_1(\jmath \omega )G_1(\jmath \omega ) | \gg 1\), then the linear response of the system is determined almost exclusively by the feedback network

$$\begin{aligned} W_1(\jmath \omega ) \approx \frac{1}{H_1(\jmath \omega )}\,. \end{aligned}$$
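A quick numerical illustration of this approximation, with an illustrative one-pole forward amplifier and a flat feedback network of our own choosing:

```python
import numpy as np

A, w0, beta = 1e4, 2*np.pi*10.0, 0.1  # illustrative values

def W1(s):
    G1 = A / (1 + s/w0)   # one-pole forward amplifier with DC gain A
    H1 = beta             # frequency-flat feedback network
    return G1 / (1 + H1*G1)

s = 1j * 2*np.pi*1.0      # 1 Hz, well inside the region where |H1 G1| >> 1
print(abs(W1(s)), 1/beta)  # the closed-loop gain sits close to 1/beta
```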

For completeness, we give the Laplace transform of \(e_1\) as well

$$\begin{aligned} E_1(s_1) = \frac{1}{1 + H_1(s_1) G_1(s_1)}\,. \end{aligned}$$

With it the first order transfer function can be written as

$$\begin{aligned} W_1(s_1) = G_1(s_1) E_1(s_1)\,. \end{aligned}$$

Next we compute the second order impulse response \(w_2\). Using the generalised response of weakly nonlinear systems (10.1), the second order components of the output signal y and of the feedback signal z are given by

$$\begin{aligned} y_2 &= g_2 *e_1^{\otimes 2} + g_1 *e_2\\ z_2 &= h_2 *y_1^{\otimes 2} + h_1 *y_2\,. \end{aligned}$$

Note that, since we used a Dirac impulse as input, the output components \(y_2\) and \(y_1\) correspond to the impulse responses \(w_2\) and \(w_1\) respectively. By substituting the first equation into the second, using the previous result for \(w_1\) and taking into account the fact that the input signal is a one dimensional distribution, we obtain an equation in \(e_2\)

$$\begin{aligned} z_2 = -e_2 = h_2 *(g_1 *e_1)^{\otimes 2} + h_1 *(g_2 *e_1^{\otimes 2} + g_1 *e_2) \end{aligned}$$

whose solution is

$$\begin{aligned} e_2 = -(\delta ^{\otimes 2} + h_1 *g_1)^{*-1} *(h_2 *w_1^{\otimes 2} + h_1 *g_2 *e_1^{\otimes 2})\,. \end{aligned}$$

With \(e_2\) and the previous results for \(e_1\) and \(w_1\) the second order impulse response is thus given by

$$\begin{aligned} w_2 = g_2 *e_1^{\otimes 2} + g_1 *e_2\,. \end{aligned}$$

Its Laplace transform is

$$\begin{aligned} W_2(s_1,s_2) = G_2(s_1,s_2) E_1(s_1) E_1(s_2) + G_1(s_1 + s_2) E_2(s_1,s_2) \end{aligned}$$

with

$$\begin{aligned} E_2(s_1,s_2) & =\\ & - \frac{ H_2(s_1,s_2) W_1(s_1) W_1(s_2) + H_1(s_1 + s_2) G_2(s_1,s_2) E_1(s_1) E_1(s_2)}{1 + H_1(s_1 + s_2) G_1(s_1 + s_2)}\,. \end{aligned}$$

Combining these expressions and using previous results, we can write \(W_2\) in the following form

$$\begin{aligned} W_2(s_1, s_2) = & \Bigl \{ E_1(s_1 + s_2) G_2(s_1, s_2) \nonumber \\ &\qquad - W_1(s_1 + s_2) H_2(s_1, s_2) G_1(s_1) G_1(s_2) \Bigr \} E_1(s_1) E_1(s_2) \end{aligned}$$
(10.7)

which is easily interpretable with the help of the signal flow graph (SFG) shown in Fig. 10.4 (see Appendix A).
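The equivalence of (10.7) with the step-by-step expressions for \(E_2\) and \(W_2\) above can be confirmed with a computer algebra system. The sketch below uses illustrative concrete linear responses (the identity holds for any choice) and leaves \(G_2\) and \(H_2\) symbolic:

```python
import sympy as sp

s1, s2 = sp.symbols('s1 s2')
G2, H2 = sp.Function('G2'), sp.Function('H2')
G1 = lambda s: 10/(1 + s)          # illustrative forward linear response
H1 = lambda s: sp.Rational(1, 2)   # illustrative flat feedback response

E1 = lambda s: 1/(1 + H1(s)*G1(s))
W1 = lambda s: G1(s)*E1(s)

# E2 and W2 assembled step by step, as in the derivation above.
E2 = -(H2(s1, s2)*W1(s1)*W1(s2)
       + H1(s1 + s2)*G2(s1, s2)*E1(s1)*E1(s2)) / (1 + H1(s1 + s2)*G1(s1 + s2))
W2_steps = G2(s1, s2)*E1(s1)*E1(s2) + G1(s1 + s2)*E2

# The closed form (10.7).
W2_closed = (E1(s1 + s2)*G2(s1, s2)
             - W1(s1 + s2)*H2(s1, s2)*G1(s1)*G1(s2)) * E1(s1)*E1(s2)

print(sp.simplify(W2_steps - W2_closed))
```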

Fig. 10.4 Signal flow graph of a weakly nonlinear system with feedback

The first term describes the transmission of the input signal, which we think of as a sum of tones, through the linear system to node E, the input of the nonlinear subsystem \({\mathcal {G}}\). This part of the signal flow is represented by the factor \(E_1(s_1)E_1(s_2)\). The second order nonlinearity of \({\mathcal {G}}\) then generates a new tone as determined by \(G_2(s_1,s_2)\). This newly generated tone is represented in the SFG by a source node because it is different from the input ones. The propagation of the new tone to the output of the system is accounted for by the last factor, \(E_1(s_1 + s_2)\).

The second summand in (10.7) has a similar interpretation. The input signal first propagates through the linear system to the input of the other nonlinear subsystem \({\mathcal {H}}\). This part of the signal flow is represented by \(G_1(s_1)E_1(s_1)G_1(s_2)E_1(s_2) = W_1(s_1)W_1(s_2)\). The second order nonlinearity of \({\mathcal {H}}\) then generates a new tone as determined and accounted for by the \(H_2(s_1,s_2)\) factor. Finally, the new tone propagates to the output of the system, contributing the last factor, \(-W_1(s_1 + s_2)\).

We now proceed with the calculation of the third order impulse response of the system. The procedure is similar to the one used for the computation of the second order one. From

$$\begin{aligned} y_3 &= g_3 *e_1^{\otimes 3} + 2 g_2 *\left[ e_1 \otimes e_2\right] _{\text {sym}} + g_1 *e_3\\ z_3 &= h_3 *y_1^{\otimes 3} + 2 h_2 *\left[ y_1 \otimes y_2\right] _{\text {sym}} + h_1 *y_3 \end{aligned}$$

and the previous results we obtain an equation for \(e_3\)

$$\begin{aligned} z_3 = -e_3 = h_3 *& (g_1 *e_1)^{\otimes 3}\\ & + 2 h_2 *\left[ (g_1 *e_1) \otimes (g_2 *e_1^{\otimes 2} + g_1 *e_2)\right] _{\text {sym}}\\ &\qquad \qquad + h_1 *(g_3 *e_1^{\otimes 3} + 2 g_2 *\left[ e_1 \otimes e_2\right] _{\text {sym}} + g_1 *e_3) \end{aligned}$$

whose solution is

$$\begin{aligned} e_3 = -(\delta ^{\otimes 3} + h_1 & *g_1)^{*-1} *\Bigl \{ h_3 *(g_1 *e_1)^{\otimes 3} \\ & + 2 h_2 *\left[ (g_1 *e_1) \otimes (g_2 *e_1^{\otimes 2} + g_1 *e_2)\right] _{\text {sym}}\\ & \qquad \qquad \qquad \qquad \qquad \quad + h_1 *(g_3 *e_1^{\otimes 3} + 2 g_2 *\left[ e_1 \otimes e_2\right] _{\text {sym}}) \Bigr \}. \end{aligned}$$

The third order impulse response is obtained by inserting this expression for \(e_3\) and the previous ones for \(e_1\) and \(e_2\) into

$$\begin{aligned} w_3 = g_3 *e_1^{\otimes 3} + 2 g_2 *\left[ e_1 \otimes e_2\right] _{\text {sym}} + g_1 *e_3\,. \end{aligned}$$

Since we find the expressions more easily interpretable in the Laplace domain, we perform this calculation there. The Laplace transforms of the last expressions for \(w_3\) and \(e_3\) are

$$\begin{aligned} W_3(s_1,s_2,s_3) = & G_3(s_1,s_2,s_3) E_1(s_1) E_1(s_2) E_1(s_3)\\ &\quad + 2 \left[ G_2(s_1,s_2+s_3) E_1(s_1) E_2(s_2,s_3)\right] _{\text {sym}}\\ &\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad + G_1(s_1 + s_2 + s_3) E_3(s_1,s_2,s_3) \end{aligned}$$

and

$$\begin{aligned} E_3(s_1,s_2,s_3) & = \frac{-1}{1 + H_1(s_1 + s_2 + s_3) G_1(s_1 + s_2 + s_3)}\\ &\qquad \qquad \bigg \{ H_3(s_1,s_2,s_3) W_1(s_1) W_1(s_2) W_1(s_3) \\ & + 2 \Bigl [H_2(s_1,s_2+s_3) W_1(s_1) \bigl [ G_2(s_2,s_3) E_1(s_2) E_1(s_3)\\ & \qquad \qquad \quad \,\, + G_1(s_2+s_3) E_2(s_2,s_3) \bigr ]\Bigr ]_{\text {sym}}\\ & + H_1(s_1+s_2+s_3) \Bigl [ G_3(s_1,s_2,s_3) E_1(s_1) E_1(s_2) E_1(s_3)\\ & \qquad \qquad \qquad \qquad \qquad \qquad + 2 \left[ G_2(s_1,s_2+s_3) E_1(s_1) E_2(s_2,s_3)\right] _{\text {sym}} \Bigr ] \bigg \} \end{aligned}$$

respectively. Combining these and previous results we can express \(W_3\) as follows

$$\begin{aligned} W_3(s_1, s_2, s_3) = E_1(s_1 + s_2 + s_3) G_3(s_1, s_2, s_3) E_1(s_1) E_1(s_2) E_1(s_3)\nonumber \\ - W_1(s_1 + s_2 + s_3) H_3(s_1, s_2, s_3) W_1(s_1) W_1(s_2) W_1(s_3)\nonumber \\ + 2 W_1(s_1 + s_2 + s_3) H_2(s_1, s_2 + s_3) W_1(s_1)\nonumber \\ \cdot \Big [ W_1(s_2 + s_3) H_2(s_2, s_3) W_1(s_2) W_1(s_3)\nonumber \\ - E_1(s_2 + s_3) G_2(s_2, s_3) E_1(s_2) E_1(s_3) \Big ]_{\text {sym}}\nonumber \\ - 2 E_1(s_1 + s_2 + s_3) G_2(s_1, s_2 + s_3) E_1(s_1)\nonumber \\ \cdot \Big [ H_1(s_2 + s_3) E_1(s_2 + s_3) G_2(s_2, s_3) E_1(s_2) E_1(s_3)\nonumber \\ + E_1(s_2 + s_3) H_2(s_2, s_3) W_1(s_2) W_1(s_3) \Big ]_{\text {sym}}\,. \end{aligned}$$
(10.8)

While this expression is rather long, it can be readily interpreted with the help of the SFG of Fig. 10.4. The first term is composed of the factor \(E_1(s_1) E_1(s_2) E_1(s_3)\) representing the input signal propagating through the linear part of the system to the input of \({\mathcal {G}}\). The third order nonlinearity of \({\mathcal {G}}\) then generates a new tone as witnessed by \(G_3(s_1, s_2, s_3)\). Finally, the newly generated tone propagates through the linear part of the system to the output, \(E_1(s_1 + s_2 + s_3)\).

The second term has a similar structure and represents the contribution to the third order nonlinearity of \({\mathcal {W}}\) by the third order nonlinearity of \({\mathcal {H}}\).

The next summand represents the mixing of the second order nonlinear component of \({\mathcal {H}}\) with the input signal, again in the second order nonlinearity of \({\mathcal {H}}\). Specifically, thinking of the input signal as composed of three tones, the factors \(W_1(s_1), W_1(s_2)\) and \(W_1(s_3)\) represent the input tones propagating through the linear part of the system to the input of \({\mathcal {H}}\). There, the second and third tones pass through the second order nonlinearity of \({\mathcal {H}}\), generating a new second order tone, \(H_2(s_2, s_3)\). This second order tone then propagates through the linear part of the system back to the input of \({\mathcal {H}}\), \(-W_1(s_2 + s_3)\). There the second order tone and the first input tone pass together through the second order distortion of \({\mathcal {H}}\) and generate a new third order tone, as witnessed by \(2H_2(s_1, s_2 + s_3)\). Finally, the third order tone propagates through the linear part of the system to the output, \(-W_1(s_1 + s_2 + s_3)\).

The remaining summands all have a structure and interpretation similar to the one just described. They describe the first input tone mixing with a second order tone. The differences between them lie in which subsystem generates the second order tone and which one mixes the first tone with it.

Higher order impulse responses and nonlinear transfer functions of \({\mathcal {W}}\) can be obtained in a similar way. While the expressions become long, they can easily be computed with the help of computer algebra system (CAS) programs and, referring to the SFG in Fig. 10.4, can be interpreted without difficulty.

From the first three nonlinear transfer functions of the feedback based system \({\mathcal {W}}\) we can draw the following conclusions.

  • The nonlinear transfer functions of the constituent subsystems play the role of controlled sources.

  • The linear part of the subsystems plays a pivotal role. It describes the propagation around the system of all input and generated signals.

  • The system \({\mathcal {W}}\) can have an impulse response of order k different from zero even if the impulse responses of order k of both subsystems \({\mathcal {G}}\) and \({\mathcal {H}}\) are zero. In particular, we saw how a third order nonlinearity can be generated by various combinations of the second order nonlinearities of \({\mathcal {G}}\) and \({\mathcal {H}}\).

  • The nonlinear terms generated exclusively by the forward subsystem \({\mathcal {G}}\) can be suppressed by making the magnitude of the loop gain \(|H_1(s) G_1(s) |\) large (in a suitable portion of the spectrum). This is so because all such terms in the nonlinear transfer function of order k are proportional to

    $$\begin{aligned} E_1(s_1) \cdots E_1(s_k) E_1(s_1 + \cdots + s_k) \end{aligned}$$

    and, as the loop gain is made large, \(E_1\) becomes small.

  • The nonlinear terms generated exclusively by the feedback subsystem \({\mathcal {H}}\) are not suppressed by making the magnitude of the loop gain \(|H_1(s) G_1(s) |\) large. That’s because none of these terms are proportional to the linear component of the error signal \(E_1\). Instead, they are all proportional to

    $$\begin{aligned} W_1(s_1) \cdots W_1(s_k) W_1(s_1 + \cdots + s_k) \end{aligned}$$

    which doesn’t necessarily become small as the loop gain is made large.

  • Nonlinear terms generated by combinations of nonlinearities of \({\mathcal {G}}\) as well as of \({\mathcal {H}}\) include factors in \(E_1\) and therefore do experience some level of suppression at large loop gains.
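The third observation can be made concrete with a small numeric sketch. Assuming, purely for illustration, memoryless subsystems \(g(u) = u + b\,u^2\) and \(h(v) = v + c\,v^2\) in the negative feedback loop \(y = g[x - h[y]]\), a damped fixed point iteration on power series truncated at third order recovers a nonzero third order coefficient even though neither subsystem has one:

```python
from fractions import Fraction as F

ORDER = 3  # keep coefficients of x, x^2, x^3

def pmul(p, q):
    """Product of two truncated series without constant term."""
    r = [F(0)] * ORDER
    for i, a in enumerate(p, start=1):
        for j, b in enumerate(q, start=1):
            if i + j <= ORDER:
                r[i + j - 1] += a * b
    return r

def padd(p, q):
    return [a + b for a, b in zip(p, q)]

def pscale(s, p):
    return [s * a for a in p]

b, c = F(1), F(2)                                # hypothetical second order coefficients
g = lambda u: padd(u, pscale(b, pmul(u, u)))     # g(u) = u + b u^2
h = lambda v: padd(v, pscale(c, pmul(v, v)))     # h(v) = v + c v^2

x = [F(1), F(0), F(0)]                           # the input series is x itself
y = [F(0)] * ORDER
for _ in range(12):                              # damped iteration y <- (y + g[x - h[y]]) / 2
    e = padd(x, pscale(F(-1), h(y)))             # error signal e = x - h[y]
    y = pscale(F(1, 2), padd(y, g(e)))

print(y)  # -> [Fraction(1, 2), Fraction(-1, 8), Fraction(-1, 16)]
```

The third coefficient, \(-1/16\), is generated entirely by the interplay of the two second order nonlinearities.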

Example 10.3: Linear Feedback

As a special case we consider a system with linear feedback. This means that all transfer functions of \({\mathcal {H}}\) are zero, except for \(H_1\). In this case the second and third order nonlinear transfer functions of the system are

$$\begin{aligned} \begin{aligned} W_2(s_1,s_2) &= \frac{G_2(s_1,s_2) E_1(s_1) E_1(s_2)}{1 + H_1(s_1 + s_2) G_1(s_1 + s_2)}\\ &= E_1(s_1 + s_2) G_2(s_1,s_2) E_1(s_1) E_1(s_2) \end{aligned} \end{aligned}$$

and

$$\begin{aligned} & W_3(s_1, s_2, s_3) = E_1(s_1 + s_2 + s_3) \Big \{ G_3(s_1, s_2, s_3) \\ & \qquad \qquad \,\, - 2 \big [ G_2(s_1, s_2 + s_3) H_1(s_2 + s_3) E_1(s_2 + s_3) G_2(s_2, s_3) \big ]_{\text {sym}} \Big \}\\ & \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \cdot E_1(s_1) E_1(s_2) E_1(s_3) \end{aligned}$$

respectively. Both of them are proportional to

$$\begin{aligned} E_1(s_1) \cdots E_1(s_k) E_1(s_1 + \cdots + s_k) \end{aligned}$$

and can therefore be suppressed by making the loop gain large.
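A quick numeric check illustrates the suppression. Assuming, for illustration only, \(G_1(s) = A/(1+s)\), \(H_1(s) = 1\) and a constant \(G_2\), each of the three \(E_1\) factors scales like 1/A at large loop gain, so \(|W_2|\) shrinks by roughly three orders of magnitude per decade of gain:

```python
def W2(A, s1, s2, g2=1.0):
    """Second order transfer function for linear feedback; hypothetical loop:
    G1(s) = A / (1 + s), H1(s) = 1, G2 = g2 constant."""
    G1 = lambda s: A / (1 + s)
    E1 = lambda s: 1 / (1 + G1(s))          # E1 = 1 / (1 + H1 G1)
    return E1(s1 + s2) * g2 * E1(s1) * E1(s2)

s1, s2 = 1j * 1.0, 1j * 2.0
mags = [abs(W2(A, s1, s2)) for A in (10, 100, 1000)]
print(mags)   # magnitudes drop roughly as A**-3
```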

Example 10.4

We revisit Example 9.5 again. Here however we replace the initial condition \(y_0\delta \) by a generic input signal x so that the system equation becomes

$$\begin{aligned} (D\delta + a\delta ) *y = x + c y^2. \end{aligned}$$

Using (10.1) we can rewrite the equation in the following form

$$\begin{aligned} y = (D\delta + a\delta )^{*-1} *(x + c y^2) \end{aligned}$$

which can be interpreted as describing a linear system with nonlinear feedback. The problem can therefore be recast as the problem of finding the nonlinear transfer functions of a system \({\mathcal {W}}\) constituted by the forward subsystem \({\mathcal {G}}\) with linear transfer function

$$\begin{aligned} G_1(s_1) = \frac{1}{s_1 + a} \end{aligned}$$

and a feedback subsystem \({\mathcal {H}}\) described by the second order nonlinear transfer function

$$\begin{aligned} H_2(s_1, s_2) = -c \end{aligned}$$

as shown in Fig. 10.5. Note that we have assumed negative feedback for consistency with our general treatment. This last expression is obtained by specialising the general expression \(c y^2\) to an input signal having only a one dimensional component \(y_1\)

$$\begin{aligned} c y_1^2 = c y_1^{\otimes 2} = c \delta ^{\otimes 2} *y_1^{\otimes 2}. \end{aligned}$$

The obtained expression clearly describes a system whose only impulse response differing from zero is the second order one which, with the negative feedback convention, is \(h_2 = -c\,\delta ^{\otimes 2}\) (see also Example 10.2).

Fig. 10.5

Signal flow graph for the system of Example 10.4

In this formulation of the problem the solution is found by inserting the above expressions for the transfer functions of the subsystems into Eqs. (10.7) and (10.8). The obtained expressions obviously agree with the ones obtained in Example 9.5 by calculation from the convolution equation.
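The agreement can also be checked numerically in the time domain. The sketch below (hypothetical values \(a = 1\), \(c = 0.5\) and a deliberately small input, so that terms of order higher than two are negligible) integrates the full equation \(y' + a y = x + c y^2\) with forward Euler and compares it against the perturbative approximation \(y \approx y_1 + y_2\) with \(y_1 = g * x\) and \(y_2 = g * (c\, y_1^2)\), where \(g(t) = e^{-at}\) is the linear impulse response:

```python
import numpy as np

a, c = 1.0, 0.5
dt, T = 1e-3, 5.0
t = np.arange(0.0, T, dt)
x = 0.05 * np.exp(-t)                  # small input keeps higher order terms negligible

def lowpass(u):
    """Convolution with g(t) = exp(-a t), i.e. forward Euler for v' + a v = u."""
    v = np.zeros_like(u)
    for n in range(1, len(u)):
        v[n] = v[n - 1] + dt * (u[n - 1] - a * v[n - 1])
    return v

# full nonlinear system: y' + a y = x + c y^2
y = np.zeros_like(x)
for n in range(1, len(x)):
    y[n] = y[n - 1] + dt * (x[n - 1] + c * y[n - 1] ** 2 - a * y[n - 1])

y1 = lowpass(x)                         # first order response
y2 = lowpass(c * y1 ** 2)               # second order response closed through the loop
err = np.max(np.abs(y - (y1 + y2)))
print(err)                              # dominated by the neglected third order response
```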

3 Linearisation

Many systems are designed based on the theory of linear systems and any deviation from linear behavior in practical implementations is undesired. For this reason one often tries to minimise the responses of order higher than one. In this section we investigate the possibility of suppressing higher order responses by preceding the system in question \({\mathcal {H}}\) with another system \({\mathcal {G}}\) or by following it with a system \({\mathcal {K}}\).

We call a system \({\mathcal {K}}\) designed to suppress all nonlinear transfer functions of \({\mathcal {K \circ H}}\) up to order k a post-lineariser of order k, and a system \({\mathcal {K}}\) suppressing all responses of \({\mathcal {K \circ H}}\) of order higher than one a post-lineariser. Similarly, we call a system \({\mathcal {G}}\) designed to suppress all nonlinear transfer functions of \({\mathcal {H \circ G}}\) up to order k a pre-lineariser of order k, and a system \({\mathcal {G}}\) suppressing all responses of \({\mathcal {H \circ G}}\) of order higher than one a pre-lineariser or pre-distorter.

We first investigate post-linearisers. The first requirement is that the system \({\mathcal {K}}\) should not change the linear response of \({\mathcal {H}}\). This is only the case if the linear impulse response of \({\mathcal {K}}\) is a Dirac impulse

$$\begin{aligned} k_1 = \delta \,. \end{aligned}$$

Next, we look for a condition to suppress the response of second order. Referring to Table 10.2 we see that the second order response of \({\mathcal {K}} \circ {\mathcal {H}}\) disappears if

$$\begin{aligned} (k \circ h)_2 = k_1 *h_2 + k_2 *h_1^{\otimes 2} = 0\,. \end{aligned}$$

Therefore, if \(h_1\) has an inverse, we can make \((k \circ h)_2\) disappear by choosing

$$\begin{aligned} k_2 = -h_2 *(h_1^{\otimes 2})^{*-1}\,. \end{aligned}$$
(10.9)

In the Laplace domain this is

$$\begin{aligned} K_2(s_1,s_2) = -\frac{H_2(s_1,s_2)}{H_1(s_1)H_1(s_2)}\,. \end{aligned}$$
(10.10)

Next we look for a condition to suppress \((k \circ h)_3\) in addition to \((k \circ h)_2\). Referring again to Table 10.2 we find the following condition

$$\begin{aligned} (k \circ h)_3 = k_1 *h_3 + 2 \, k_2 *\left[ h_1 \otimes h_2\right] _{\text {sym}} + k_3 *h_1^{\otimes 3} = 0\,. \end{aligned}$$

As for the second order, this equation can be solved for \(k_3\) only if \(h_1\) has an inverse, in which case, using the previously obtained values for \(k_1\) and \(k_2\), we find

$$\begin{aligned} k_3 = \bigl ( -h_3 + 2 h_2 *(h_1^{\otimes 2})^{*-1} *\left[ h_1 \otimes h_2\right] _{\text {sym}} \bigr ) *(h_1^{\otimes 3})^{*-1} \end{aligned}$$
(10.11)

with Laplace transform

$$\begin{aligned} K_3(s_1, s_2, s_3) = \frac{-H_3(s_1, s_2, s_3) + 2 \left[ \frac{H_2(s_1, s_2 + s_3)}{H_1(s_2 + s_3)} H_2(s_2, s_3)\right] _{\text {sym}}}{H_1(s_1) H_1(s_2) H_1(s_3)}\,. \end{aligned}$$
(10.12)

This procedure can be extended to find the transfer functions of \({\mathcal {K}}\) up to order j such that they cancel the nonlinear responses of \({\mathcal {K \circ H}}\) up to the jth order. The condition for the existence of \(k_j\) is always the same: the existence of the inverse of \(h_1\). This is so because in each equation \((k \circ h)_j = 0\), \(k_j\) appears convolved with \(h_1^{\otimes j}\). If we let j tend to infinity we obtain a post-lineariser suppressing all nonlinear responses of \({\mathcal {H}}\).
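For a memoryless system the recipe collapses to simple algebra. As an illustrative (hypothetical) case, take \(h_1 = a_1\delta \) and \(h_2 = a_2\delta ^{\otimes 2}\); then (10.10) gives the constant \(K_2 = -a_2/a_1^2\) and the second order term of \({\mathcal {K}} \circ {\mathcal {H}}\) cancels, while new terms of order three and higher appear:

```python
a1, a2 = 2.0, 0.5
K2 = -a2 / a1 ** 2                          # (10.10) specialised to a memoryless system

def H(x): return a1 * x + a2 * x * x        # system to linearise
def K(y): return y + K2 * y * y             # post-lineariser, k1 = delta

# K(H(x)) = a1 x + 0 x^2 - (2 a2^2 / a1) x^3 - (a2^3 / a1^2) x^4;
# estimate the x^2 coefficient by a central second difference
eps = 1e-5
c2 = (K(H(eps)) + K(H(-eps)) - 2 * K(H(0.0))) / (2 * eps ** 2)
print(c2)   # numerically zero, while H alone has second order coefficient a2 = 0.5
```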

The impulse responses of a pre-lineariser \({\mathcal {G}}\) can be obtained following a similar procedure. To preserve the linear response of \({\mathcal {H}}\), the linear impulse response of \({\mathcal {G}}\) must be a Dirac impulse, as for a post-lineariser

$$\begin{aligned} g_1 = \delta . \end{aligned}$$

The second order response of \({\mathcal {H \circ G}}\) disappears if

$$\begin{aligned} g_2 = - h_1^{*-1} *h_2 \end{aligned}$$
(10.13)

or, expressed in the Laplace domain, if

$$\begin{aligned} G_2(s_1,s_2) = -\frac{H_2(s_1,s_2)}{H_1(s_1 + s_2)}\,. \end{aligned}$$
(10.14)

The third order response of \({\mathcal {H \circ G}}\) disappears if

$$\begin{aligned} g_3 = h_1^{*-1} *\bigl ( -h_3 + 2 h_2 *\left[ \delta \otimes (h_1^{*-1} *h_2)\right] _{\text {sym}} \bigr ) \end{aligned}$$
(10.15)

whose Laplace transform is

$$\begin{aligned} G_3(s_1, s_2, s_3) = \frac{-H_3(s_1, s_2, s_3) + 2 \left[ H_2(s_1, s_2 + s_3) \frac{H_2(s_2, s_3)}{H_1(s_2 + s_3)}\right] _{\text {sym}}}{H_1(s_1 + s_2 + s_3)} \end{aligned}$$
(10.16)

and so on. Again, the prerequisite for the existence of these solutions is the existence of the inverse of \(h_1\). Note also that in general the transfer functions of a pre-lineariser are different from the ones of a post-lineariser.

In summary, we can state that a weakly nonlinear system can be linearised with a pre- or a post-lineariser only if its linear transfer function has a stable inverse in the convolution algebra of interest. In the convolution algebra of right sided distributions this means the existence of a causal and stable inverse.

A generic linear system may not have an inverse. For example, if the linear impulse response \(h_1\) is a right-sided, infinitely differentiable function then \(h_1 *w\) is an infinitely differentiable function independently of the choice of w. This means that \(h_1 *w = \delta \) has no solution and hence \(h_1\) has no inverse.

A class of systems of special interest to us is the class of causal systems whose transfer functions are rational functions

$$\begin{aligned} H_1(s) = \frac{N(s)}{P(s)}\,. \end{aligned}$$

For this class of systems \(H_1(s)\) is stable and has a causal stable inverse if all poles and zeros of \(H_1(s)\) are in the left-half of the complex plane.
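This condition is easy to test numerically by locating the roots of N and P. A minimal sketch with illustrative coefficients (highest power first, as expected by numpy.roots):

```python
import numpy as np

def lhp(coeffs):
    """True if all roots of the polynomial lie in the open left half plane."""
    return bool(np.all(np.roots(coeffs).real < 0))

def has_stable_causal_inverse(num, den):
    # all poles AND zeros of H1 = N / P must be in the left half plane
    return lhp(num) and lhp(den)

print(has_stable_causal_inverse([1, 1], [1, 2]))    # (s+1)/(s+2): True
print(has_stable_causal_inverse([1, -1], [1, 2]))   # (s-1)/(s+2): RHP zero, False
```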

Example 10.5: Memory-less System Linearisation

In this example we consider a third order memory-less system \({\mathcal {H}}\) with impulse responses

$$\begin{aligned} h_1 = a_1\delta \qquad h_2 = 0 \qquad h_3 = -a_3\delta ^{\otimes 3}\,. \end{aligned}$$

We would like to find a pre-lineariser \({\mathcal {G}}\) suppressing the responses of third order.

The linear impulse response of the system has an inverse

$$\begin{aligned} a_1 \delta *\frac{1}{a_1}\delta = \delta . \end{aligned}$$

Therefore it can be linearised using the results of this section. As \(h_2 = 0\), the second-order impulse response of the pre-lineariser must also vanish

$$\begin{aligned} g_2 = 0\,. \end{aligned}$$

The third order impulse response of the pre-lineariser is obtained by applying (10.15) and we find

$$\begin{aligned} g_3 = \frac{1}{a_1}\delta *a_3\delta ^{\otimes 3} = \frac{a_3}{a_1}\delta ^{\otimes 3}\,. \end{aligned}$$

Note that while the pre-lineariser suppresses responses of third order, it does introduce responses of higher order

$$\begin{aligned} \begin{aligned} h \circ g &= a_1\delta *(\delta + \frac{a_3}{a_1}\delta ^{\otimes 3}) - a_3\delta ^{\otimes 3} *(\delta + \frac{a_3}{a_1}\delta ^{\otimes 3})^3\\ &= a_1\delta - 3 \frac{a_3^2}{a_1}\delta ^{\otimes 5} - 3 \frac{a_3^3}{a_1^2}\delta ^{\otimes 7} - \frac{a_3^4}{a_1^3}\delta ^{\otimes 9}\,. \end{aligned} \end{aligned}$$
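The composite response can be reproduced by exact polynomial arithmetic. A minimal sketch with hypothetical values \(a_1 = 2\), \(a_3 = 3\), storing coefficients in ascending powers of x:

```python
from fractions import Fraction as F

def pmul(p, q):
    """Exact product of two polynomials given as ascending coefficient lists."""
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def padd(p, q):
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p))
    q = q + [F(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

a1, a3 = F(2), F(3)
g = [F(0), F(1), F(0), a3 / a1]              # pre-lineariser g(x) = x + (a3/a1) x^3
g3 = pmul(g, pmul(g, g))                     # g(x)^3
hg = padd([a1 * v for v in g], [-a3 * v for v in g3])   # h(u) = a1 u - a3 u^3 applied to g

print(hg[1], hg[3], hg[5])   # linear term a1, vanishing third order, -3 a3^2 / a1
```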

It’s easy to see that to suppress the nonlinear responses up to order k the pre-lineariser must be of order k. To suppress them all a full pre-lineariser is needed.

4 System Manipulations

In this section we highlight some properties of weakly nonlinear systems that allow us to manipulate a weakly nonlinear system composed of sub-systems in such a way as to obtain different interconnections of the sub-systems without changing the behavior of the overall system.

The first property that we discuss is the associativity of addition, which comes from the fact that \({\mathcal {D}}'_{\oplus ,\text {sym}}\) is a vector space. Thus if f, g and h are three weakly nonlinear systems driven by the same input signal x, the way in which the outputs are summed is irrelevant

$$\begin{aligned} (f[x] + g[x]) + h[x] = f[x] + (g[x] + h[x]) = f[x] + g[x] + h[x]\,. \end{aligned}$$

The same is true for the product of the output signals

$$\begin{aligned} (f[x] \cdot g[x]) \cdot h[x] = f[x] \cdot (g[x] \cdot h[x]) = f[x] \cdot g[x] \cdot h[x]\,. \end{aligned}$$

This is the case because the product that we defined on \({\mathcal {D}}'_{\oplus ,\text {sym}}\) is defined in terms of the tensor product and the latter is associative.

A second important property is commutativity. Addition is always commutative, therefore the order in which the signals appear as inputs to adders is irrelevant

$$\begin{aligned} f[x] + g[x] = g[x] + f[x]\,. \end{aligned}$$

While the tensor product is not commutative, the symmetrised tensor product is, and with it the product in \({\mathcal {D}}'_{\oplus ,\text {sym}}\)

$$\begin{aligned} f[x] \cdot g[x] = g[x] \cdot f[x]\,. \end{aligned}$$

Thus the order in which the signals appear as input to multipliers is irrelevant as well. In fact, because it's cumbersome to draw symmetrised block diagrams, we will generally draw unsymmetrised block diagrams and, unless stated otherwise, imply symmetrisation.

A further equivalence of block diagrams comes from the distributivity of the product over addition

$$\begin{aligned} (f[x] + g[x]) \cdot h[x] &= f[x] \cdot h[x] + g[x]\cdot h[x]\,,\\ f[x] \cdot (g[x] + h[x]) &= f[x] \cdot g[x] + f[x]\cdot h[x]\,. \end{aligned}$$

This property originates from the multi-linearity of the tensor product. A block diagram representation of the first equality is shown in Fig. 10.6.

Fig. 10.6

Distributivity of WNTI systems

Another equivalence is given by the equation

$$\begin{aligned} (g \circ f)[x] + (h \circ f)[x] = (g + h) \circ f[x]\,. \end{aligned}$$

To prove the validity of this equation we prove its validity for terms of each order individually. To simplify the expressions, let's denote the sum of all l-fold tensor products of components of f resulting in a distribution of order k by

$$\begin{aligned} f_k^{(l)} :=\sum _{\begin{array}{c} |\alpha | = l\\ |\kappa \alpha |=k \end{array}} \left[ f^{\otimes \alpha }\right] _{\text {sym}} \end{aligned}$$

with \(\alpha \) a multi-index in \({\mathbb {N}}^k\) and \(\kappa =(1,2,\ldots ,k)\). With this notation the kth order impulse responses of the summands on the left-hand side can be written as

$$\begin{aligned} (g \circ f)_k = \sum _{l=1}^k g_l *f_k^{(l)}\,, \qquad (h \circ f)_k = \sum _{l=1}^k h_l *f_k^{(l)}\,. \end{aligned}$$

The two can be combined using the distributivity of convolution (3.13) to obtain

$$\begin{aligned} \sum _{l=1}^k (g_l + h_l) *f_k^{(l)} \end{aligned}$$

which is the kth order impulse response of the expression on the right-hand side.

The last useful property in manipulating block diagrams is the right distributivity of composition

$$\begin{aligned} (g \circ f)[x] \cdot (h \circ f)[x] = (g \cdot h) \circ f[x]\,. \end{aligned}$$

We prove again this equality by proving its validity for terms of each order individually. The impulse response of order k on the left-hand side is

$$\begin{aligned} \sum _{i+j=k} \Bigl ( \sum _{l=1}^i g_l *f_i^{(l)} \Bigr ) \cdot \Bigl ( \sum _{m=1}^j h_m *f_j^{(m)} \Bigr )\,. \end{aligned}$$

Hence, dropping symmetrisation operators for simplicity of notation

$$\begin{aligned} \sum _{i+j=k} \sum _{l+m \le k} (g_l *f_i^{(l)}) \otimes (h_m *f_j^{(m)}) &= \sum _{i+j=k} \sum _{l+m \le k} (g_l \otimes h_m) *(f_i^{(l)} \otimes f_j^{(m)})\\ &= \sum _{s=1}^k \sum _{l+m = s} (g_l \otimes h_m) *f_k^{(s)} \end{aligned}$$

which corresponds to the kth order impulse response of the right-hand side of the equation. A block diagram representation of the property is shown in Fig. 10.7.
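For memoryless systems both distributivity properties reduce to pointwise identities between functions, which permits a quick spot check (the polynomial systems below are hypothetical):

```python
f = lambda x: x + 0.1 * x ** 2          # shared front end f
g = lambda u: 2 * u - 0.3 * u ** 3
h = lambda u: u + 0.2 * u ** 2
gh_sum  = lambda u: g(u) + h(u)         # the system g + h
gh_prod = lambda u: g(u) * h(u)         # the system g . h

ok = all(
    g(f(x)) + h(f(x)) == gh_sum(f(x)) and
    g(f(x)) * h(f(x)) == gh_prod(f(x))
    for x in (-1.0, 0.5, 2.0)
)
print(ok)   # -> True
```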

Fig. 10.7

Right distributivity of composition of WNTI systems. The empty circle represents either a sum or a product

5 Structure

A review of our development of the theory of weakly nonlinear systems up to this point reveals that weakly nonlinear systems arise out of stable linear systems and multipliers. In particular, multipliers are the only means by which we can combine linear systems to produce systems of higher order. In this section we investigate the overall structure of systems constructed this way.

Let’s start by considering the most generic impulse response of second-order that can be constructed out of a single multiplier and linear systems \(h_A, h_B\) and \(h_C\)

$$\begin{aligned} h_2(\tau _1,\tau _2) = \left[ h_C *(h_A \otimes h_B)\right] _{\text {sym}}. \end{aligned}$$

The block diagram of a system whose only impulse response is \(h_2\) is shown in Fig. 10.8a. We call a system whose only impulse response is \(h_i\) a monomial system of ith order.

In Sect. 3.3 we showed that every distribution can be approximated to arbitrary accuracy by a set of weighted Dirac impulses. We can thus approximate the linear system \(h_C\) by

$$\begin{aligned} h_C(\tau ) \approx \sum _{n=0}^N c_n \delta (\tau - \lambda _n)\,,\qquad c_n\in {\mathbb {C}},\quad \lambda _n \in [0,\infty ) \end{aligned}$$

where we assume the system to be causal. Using this approximation in \(h_2\) we obtain

$$\begin{aligned} h_2(\tau _1, \tau _2) \approx \sum _{n=0}^N \left[ c_n \delta (\tau _1 - \lambda _n) *\bigl (h_A(\tau _1) \otimes h_B(\tau _2)\bigr )\right] _{\text {sym}}. \end{aligned}$$

The shifting property of convolution (3.16) extends to convolutions between distributions of different dimensions in a similar way as the differentiation rule (10.6). In particular, for the one dimensional distribution \(f_1\) and the i dimensional one \(g_i\) we have

$$\begin{aligned} f_1(\tau _1 - \lambda ) *g_i(\tau _1,\dotsc ,\tau _i) = f_1(\tau _1) *g_i(\tau _1-\lambda ,\dotsc ,\tau _i-\lambda ). \end{aligned}$$

Using this property the response of the system can be expressed as

$$\begin{aligned} y_2(\tau _1,\tau _2) &= h_2(\tau _1, \tau _2) *\bigl (x(\tau _1) \otimes x(\tau _2)\bigr ) \\ &\approx \sum _{n=0}^N c_n \left[ h_A(\tau _1) \otimes h_B(\tau _2)\right] _{\text {sym}} *\bigl (x(\tau _1 - \lambda _n) \otimes x(\tau _2 - \lambda _n)\bigr )\,. \end{aligned}$$

This shows that all delays required to approximate \(h_C\) to any desired accuracy can be moved to delays of the input signal as illustrated in Fig. 10.8b.

Fig. 10.8

\({\textbf {a}}\) Block diagram of the most generic monomial system of second-order constructed with a single multiplier and linear systems \(h_A, h_B\) and \(h_C\). \({\textbf {b}}\) Approximation of the system in Fig. 10.8a

If we use a similar approximation for \(h_A\) and \(h_B\) we obtain

$$\begin{aligned} & y_2(\tau _1,\tau _2) \\ & \qquad \quad \approx \sum _{n_c=0}^{N_c} \sum _{n_a=0}^{N_a} \sum _{n_b=0}^{N_b} c_{n_c} \left[ a_{n_a} b_{n_b}\right] _{\text {sym}} x(\tau _1 - \lambda (n_a + n_c)) \otimes x(\tau _2 - \lambda (n_b+n_c)) \end{aligned}$$

where we have assumed the use of equal and uniform delays for all sub-systems.

Monomial systems of higher order can be constructed in a similar way by combining linear systems and more multipliers. If we approximate all linear sub-systems as we did above for the second-order monomial system, it's easy to see that all delays can be moved to the input of the system. A system of order K is the sum of monomial sub-systems of order up to K. Therefore, weakly nonlinear systems of finite order can be represented as composed of two sections: an input tapped delay line sub-system that represents the memory of the system and a memoryless sub-system composed of adders and multipliers as illustrated in Fig. 10.9.
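The conceptual structure translates directly into a discrete time sketch: a tapped delay line feeding a memoryless section of adders and multipliers. The tap values below are hypothetical:

```python
import numpy as np

def volterra2(x, h1, h2):
    """Discrete time system of order 2 in tapped delay line form: at every sample
    the tap vector feeds a memoryless polynomial built from adders and multipliers.
    h1: linear tap weights, h2: symmetric matrix of second order tap weights."""
    M = len(h1)
    y = np.zeros(len(x))
    for n in range(len(x)):
        taps = np.array([x[n - m] if n - m >= 0 else 0.0 for m in range(M)])
        y[n] = h1 @ taps + taps @ h2 @ taps
    return y

x = np.array([1.0, 0.0, 0.0, 0.0])      # discrete impulse
h1 = np.array([1.0, 0.5])
h2 = 0.25 * np.eye(2)
y = volterra2(x, h1, h2)
print(y)   # response samples: 1.25, 0.75, 0, 0
```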

An estimate for the maximum delay necessary to faithfully represent a given system of order K can be obtained from the sampling theorem (see Example 12.5): If the maximum frequency component of the input signal is \(f_{\text {max}}\), then the highest frequency at the output of the system is \(K f_{\text {max}}\) and the delay must be bounded by

$$\begin{aligned} \lambda < \frac{1}{2\,K f_{\text {max}}}. \end{aligned}$$

The number of taps depends on the amount of memory of the linear sub-systems to be approximated.

Fig. 10.9

Conceptual structure of a WNTI system

The system structure represented in Fig. 10.9 is not the most economical one. A comparison between Fig. 10.8a and b reveals that if one moves all the system memory to the input of the system then one needs a larger number of multipliers than by distributing the memory across sub-systems. This is entirely analogous to the trade-off in the implementation of discrete time filters as finite-impulse response (FIR) versus infinite-impulse response (IIR) filters.