The objective of this chapter is to show that the solution of ordinary differential equations, if based on distributions as opposed to functions, can be obtained by (mostly) algebraic methods. These methods are rigorous forms of the so-called Heaviside’s operational or symbolic calculus. The close relationship to the integral transforms that convert convolution into the ordinary multiplication is also shown.

Notation

With this chapter we stop using uppercase letters such as T to denote distributions. Instead, we use lowercase letters of the kind typically used for functions, for example f. We also adopt the convention of denoting the Laplace transform of a distribution, say f, by the same letter changed to uppercase, e.g. \(F = {\mathcal {L}}\{f\}\). When we need to distinguish between the ordinary and the distributional differential operator, we will in general denote the former by \(\frac{\textrm{d}}{\textrm{d}t}\) and continue to denote the latter by \(D\).

1 Convolution Algebra

An algebra \({\mathcal {A}}\) is a vector space together with an associative product \(\odot \) such that the product of any two vectors is again a vector in \({\mathcal {A}}\), and such that for any constants \(a, b\) and any vectors \(f,g,h \in {\mathcal {A}}\) the following distributivity laws are valid

$$\begin{aligned} (a f + b g) \odot h &= a(f \odot h) + b(g \odot h) \end{aligned}$$
(7.1)
$$\begin{aligned} f \odot (a g + b h) &= a(f \odot g) + b(f \odot h)\,. \end{aligned}$$
(7.2)

The convolution product seems like an adequate product to make an algebra out of distributions. Unfortunately, as we saw, the convolution product is not defined for arbitrary distributions. The solution is to restrict the set of distributions to a vector subspace of \({\mathcal {D'}}\) on which the convolution is well-defined.

Definition 7.1

(Convolution algebra) A convolution algebra \({\mathcal {A'}}\) is a vector subspace of \({\mathcal {D'}}\) with the following properties:

  • The convolution product is associative.

  • \({\mathcal {A'}}\) with the convolution product forms an algebra.

  • \(\delta \) is in \({\mathcal {A'}}\).

A convolution algebra is thus an algebra with a unit, namely \(\delta \), and with a product that is always commutative. We also note that the triple \(({\mathcal {A'}},+,*)\) forms a commutative ring.

We have already met three examples of convolution algebras: (i) the set of right-sided distributions \({\mathcal {D_+'}}\), (ii) the set of periodic distributions and (iii) the set of distributions with compact support \({\mathcal {E'}}\).

2 Convolution Equations

In this section we study convolution equations. We will see that they provide a framework for studying a broad class of systems in the time domain that is the counterpart of the one based on the Laplace transform.

A convolution equation is an equation of the form

$$\begin{aligned} g *y = x \end{aligned}$$
(7.3)

with g and x given distributions and y a distribution to be determined. In this section we assume g, x and y to be elements of a convolution algebra \({\mathcal {A'}}\). Suppose that g has an inverse in \({\mathcal {A'}}\), that is, there is an element denoted by \(g^{*-1} \in {\mathcal {A'}}\) such that

$$\begin{aligned} g *g^{*-1} = g^{*-1} *g = \delta \,. \end{aligned}$$

Then \(g^{*-1} *x\) is a solution of the equation for any x, since

$$\begin{aligned} y = g^{*-1} *g *y = g^{*-1} *x\,. \end{aligned}$$

Note that if there is an inverse \(g^{*-1}\) then it must be unique, since if \(g_1^{*-1}\) is another inverse we have

$$\begin{aligned} g *(g^{*-1} - g_1^{*-1}) = (g *g^{*-1}) - (g *g_1^{*-1}) = \delta - \delta = 0 \end{aligned}$$

and convolving both sides with \(g^{*-1}\) gives \(g^{*-1} - g_1^{*-1} = 0\).

Conversely, suppose that (7.3) has a solution for any right-hand side x. Then it has a solution for \(x = \delta \), and that solution is by definition the inverse of g. Consequently, we can say that, if g has an inverse in \({\mathcal {A'}}\), then the equation has a unique solution for any right-hand side x, and the solution is

$$\begin{aligned} y = g^{*-1} *x\,. \end{aligned}$$
(7.4)

Therefore, knowledge of \(g^{*-1}\) permits us to find the solution of (7.3) for any right-hand side x. For this reason \(g^{*-1}\) is called the elementary or fundamental solution of the convolution equation.

Note that if g has an inverse \(g^{*-1}\) that is not an element of the convolution algebra \({\mathcal {A'}}\), then the expression \(g^{*-1} *x\) may not exist and \(g^{*-1} *g *y\) may not be associative (see Example 3.5). Hence, (7.4) cannot be proved to be equivalent to (7.3).

Suppose that \(g_1\) and \(g_2\) are two elements of the convolution algebra \({\mathcal {A'}}\) having inverses \(g_1^{*-1}\) and \(g_2^{*-1}\), respectively. Then their convolution product \(g_1 *g_2\) has an inverse as well and it is given by

$$\begin{aligned} (g_1 *g_2)^{*-1} = g_1^{*-1} *g_2^{*-1} \end{aligned}$$
(7.5)

for

$$\begin{aligned} (g_1 *g_2)^{*-1} *(g_1 *g_2) &= \delta \\ &= g_1 *g_1^{*-1} *g_2 *g_2^{*-1} \\ &= (g_1^{*-1} *g_2^{*-1}) *(g_1 *g_2)\,. \end{aligned}$$

From this we see that, if in (7.3) g can be represented as the convolution product of m invertible elements \(g_i, i=1,\ldots ,m\), then the solution of the equation can be expressed as the convolution product of their inverses

$$\begin{aligned} y = g_1^{*-1} *\ldots *g_m^{*-1} *x\,. \end{aligned}$$
(7.6)

Every convolution algebra has a unit by definition, and in every algebra with a unit one can perform a partial fraction expansion. Therefore, every convolution product of inverses can be represented as a sum of inverses.

Example 7.1: Partial Fraction Expansion

Consider the following convolution product

$$\begin{aligned} (D\delta + a\delta )^{*-1} *(D\delta - b\delta )^{*-1} \end{aligned}$$

with a and b different constants. Its partial fraction expansion has the form

$$\begin{aligned} c_a (D\delta + a\delta )^{*-1} + c_b (D\delta - b\delta )^{*-1} \end{aligned}$$

with \(c_a\) and \(c_b\) constants to be determined. If we take the convolution of both expressions with

$$\begin{aligned} (D\delta + a\delta ) *(D\delta - b\delta ) \end{aligned}$$

we obtain the following equation

$$\begin{aligned} \delta = c_a (D\delta - b\delta ) + c_b (D\delta + a\delta )\,. \end{aligned}$$

Equating the coefficients of \(\delta \) and \(D\delta \) we obtain two equations for \(c_a\) and \(c_b\) whose solution is

$$\begin{aligned} c_b = -c_a = \frac{1}{a+b}\,. \end{aligned}$$
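Under the correspondence \(D\delta \leftrightarrow s\) developed later in this chapter, this expansion can be spot-checked with a computer algebra system. A minimal sketch, assuming sympy is available:

```python
import sympy as sp

s, a, b = sp.symbols('s a b')

# Laplace-domain counterpart of the convolution product
# (D*delta + a*delta)^{*-1} * (D*delta - b*delta)^{*-1}
product = 1 / ((s + a) * (s - b))

# Partial fraction expansion with respect to s
expansion = sp.apart(product, s)

# Coefficients found in Example 7.1: c_b = -c_a = 1/(a + b)
c_a, c_b = -1 / (a + b), 1 / (a + b)
assert sp.simplify(expansion - (c_a / (s + a) + c_b / (s - b))) == 0
```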

A (convolution) algebra is said to be free from zero divisors if

$$\begin{aligned} g_1 *g_2 = 0 \end{aligned}$$

implies that either \(g_1 = 0\) or \(g_2 = 0\). In this case the algebra is called an integral domain and a convolution equation with common factors on both sides of the equation can be simplified. For example, assuming \(f \ne 0\), the equation

$$\begin{aligned} f *g *y = f *x \end{aligned}$$

can be simplified to

$$\begin{aligned} g *y = x\,. \end{aligned}$$

In fact, the original equation can be written as

$$\begin{aligned} f *(g *y - x) = 0 \end{aligned}$$

and since f is different from zero, we can deduce the simplified form.

We will see that the convolution algebra of right-sided distributions \({\mathcal {D_+'}}\) is an integral domain. The algebra of periodic distributions \({\mathcal {D'}}({\mathbb {T}})\) is not.

3 Initial Value Problems

In this section we want to apply the results of the previous section to the study of initial value problems. In particular, let L denote the linear differential operator with constant coefficients of order m

$$\begin{aligned} L = D^m + a_{m-1}D^{m-1} + \cdots + a_1D+ a_0 \end{aligned}$$

where for convenience we have set \(a_m=1\). We are interested in the solution of the differential equation

$$\begin{aligned} L y(t) = x(t) \end{aligned}$$
(7.7)

for \(t \ge 0\) with initial conditions

$$\begin{aligned} (D^k y)(0) = y_k \qquad k=0,\ldots ,m-1 \end{aligned}$$
(7.8)

with x and y functions and with differentiation intended in the usual sense of differentiation of functions.

As a first step in translating this problem into the language of distributions, we note that the convolution algebra \({\mathcal {D_+'}}\) is well suited for the study of initial value problems. Every element of the algebra can be thought of as being in a zero state for \(t < 0\) and representing some excitation or state evolution for \(t \ge 0\). The functions x and y can be associated with distributions of \({\mathcal {D_+'}}\) by extending them to negative values of t, where we assign them the value zero. To make this explicit it is usual to write them multiplied by the unit step \(\textsf{1}_{+}\).

The second step is to perform differentiation in the sense of distributions. With the results of Example 2.9, for the first derivative of \(\textsf{1}_{+}y\) we have

$$\begin{aligned} D(\textsf{1}_{+}(t) y(t)) = \textsf{1}_{+}(t) Dy(t) + y_0\delta \end{aligned}$$

and similarly, for the higher order derivatives

$$\begin{aligned} D^2(\textsf{1}_{+}(t) y(t)) &= \textsf{1}_{+}(t) D^2 y(t) + y_0D\delta + y_1\delta \\ &\;\;\vdots \\ D^m(\textsf{1}_{+}(t) y(t)) &= \textsf{1}_{+}(t) D^m y(t) + y_0D^{m-1}\delta + \cdots + y_{m-1}\delta \,. \end{aligned}$$

Note that in all these expressions the first term on the right-hand side is the conventional derivative of the function y (multiplied by \(\textsf{1}_{+}\)). Putting these results in the differential equation we obtain an equivalent equation for the distribution \(\textsf{1}_{+}y\)

$$\begin{aligned} L (\textsf{1}_{+}y) &= \textsf{1}_{+}L y + \sum _{k=0}^{m-1} \sigma _k D^k\delta \\ &= \textsf{1}_{+}x + \sum _{k=0}^{m-1} \sigma _k D^k\delta \end{aligned}$$

with

$$\begin{aligned} \sigma _k &= a_{1+k}y_0 + a_{2+k}y_1 + \cdots + y_{m-k-1} \nonumber \\ &= \sum _{i=0}^{m-1-k} a_{i+1+k}y_i\,, \qquad k=0,\dotsc ,m-1 \end{aligned}$$
(7.9)

and \(a_m = 1\).

The last step required to translate the initial value problem into a convolution equation is to use the fact that the kth order derivative of a distribution can be expressed as the convolution product with \(D^k\delta \) so that

$$\begin{aligned} L(\textsf{1}_{+}y) = L \delta *\textsf{1}_{+}y\,. \end{aligned}$$

The initial value problem defined by (7.7) and (7.8) is therefore equivalent to the following convolution equation of distributions

$$\begin{aligned} L \delta *\textsf{1}_{+}y = \textsf{1}_{+}x + \sum _{k=0}^{m-1} \sigma _k D^k\delta \,. \end{aligned}$$
(7.10)

With the results of the previous section, if the distribution \(L\delta \) has an inverse in \({\mathcal {D_+'}}\) (the elementary solution of the equation), the solution of the equation for arbitrary right-hand side \(\textsf{1}_{+}x\) and initial conditions is given by

$$\begin{aligned} \textsf{1}_{+}y = (L\delta )^{*-1} *\textsf{1}_{+}x + \sum _{k=0}^{m-1} \sigma _k D^k\left[ (L\delta )^{*-1}\right] \,. \end{aligned}$$
(7.11)

It is worth highlighting two important points. The first is that the differential equation (7.7) is not a full description of the problem: to fully specify it, the equation has to be accompanied by the initial conditions expressed by Eq. (7.8). In contrast, the convolution equation (7.10) is a full description of the problem.

The second point that we want to highlight is that (7.11) is a global solution of the problem, that is, the solution is specified for all times. In contrast, the classical solution of the original initial value problem is a function valid only for \(t \ge 0\).
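As a concrete sketch of (7.11) for the first-order operator \(L = D + a\), whose elementary solution is \(\textsf{1}_{+}e^{-at}\) (derived in Example 7.2 below), the solution for \(t \ge 0\) reduces to \(y(t) = e^{-at}y_0 + \int_0^t e^{-a(t-\tau )}x(\tau )\,d\tau \). This can be checked with a computer algebra system; the input \(x(t) = \sin t\) is an arbitrary choice of ours, assuming sympy:

```python
import sympy as sp

t, tau = sp.symbols('t tau')
a, y0 = sp.symbols('a y0', positive=True)

# First-order instance of (7.11): (L*delta)^{*-1} = 1_+(t) e^{-a t} and
# sigma_0 = y0, so for t >= 0
#   y(t) = e^{-a t} y0 + integral_0^t e^{-a (t - tau)} x(tau) dtau.
# Arbitrarily chosen input for the check: x(t) = sin(t)
y = sp.exp(-a * t) * y0 + sp.integrate(sp.exp(-a * (t - tau)) * sp.sin(tau), (tau, 0, t))

# y satisfies Dy + a y = x with y(0) = y0
assert sp.simplify(sp.diff(y, t) + a * y - sp.sin(t)) == 0
assert sp.simplify(y.subs(t, 0) - y0) == 0
```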

Next we show that the inverse \((L\delta )^{*-1}\) exists. To this end, note that if in (7.10) we set \(x=0\) as well as \(\sigma _0=1\) and \(\sigma _k=0\), \(k=1,\ldots ,m-1\), we obtain the equation defining the inverse

$$\begin{aligned} L\delta *(L\delta )^{*-1} = \delta \,. \end{aligned}$$

The inverse of \(L\delta \) is thus the distribution \(\textsf{1}_{+}e\), with e the solution of the homogeneous equation

$$\begin{aligned} L e(t) = 0 \end{aligned}$$

with initial conditions

$$\begin{aligned} (D^{m-1}e)(0) = 1 \qquad \text {and} \qquad (D^ke)(0) = 0\,, \quad k=0,\ldots ,m-2\,. \end{aligned}$$

Example 7.2: Fundamental Solution

Consider the differential operator

$$\begin{aligned} L = D+ a . \end{aligned}$$

The solution of the homogeneous differential equation \(Le(t) = 0\) with initial condition \(e(0) = 1\) is

$$\begin{aligned} e(t) = e^{-at}\,. \end{aligned}$$

The inverse of \(L\delta \) in the convolution algebra \({\mathcal {D_+'}}\) is therefore

$$\begin{aligned} (L\delta )^{*-1} = (D\delta + a\delta )^{*-1} = \textsf{1}_{+}(t) e^{-at}\,. \end{aligned}$$

This is easily verified by inserting it into the convolution equation for the operator L

$$\begin{aligned} L\delta *(L\delta )^{*-1} &= (D\delta + a\delta ) *\textsf{1}_{+}(t) e^{-at} = D(\textsf{1}_{+}(t) e^{-at}) + a \textsf{1}_{+}(t) e^{-at} \\ &= -a \textsf{1}_{+}(t) e^{-at} + \delta + a \textsf{1}_{+}(t) e^{-at} = \delta \,. \end{aligned}$$

In a similar way we find

$$\begin{aligned} (D\delta + a\delta )^{*-m} = \textsf{1}_{+}(t) \frac{t^{m-1}}{(m-1)!} e^{-at} \end{aligned}$$

with m a positive integer.
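These inverses can be spot-checked through the Laplace transform discussed later in this chapter, where they correspond to \(1/(s+a)^m\). A sketch assuming sympy:

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)
a = sp.symbols('a', positive=True)

# 1_+(t) t^{m-1}/(m-1)! e^{-a t} has Laplace transform 1/(s + a)^m,
# as expected for (D*delta + a*delta)^{*-m}; check a few values of m
for m in range(1, 4):
    f = t**(m - 1) / sp.factorial(m - 1) * sp.exp(-a * t)
    F = sp.laplace_transform(f, t, s, noconds=True)
    assert sp.simplify(F - 1 / (s + a)**m) == 0
```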

Let’s focus for a moment on the distribution \(L\delta \) and observe that it looks like a polynomial P with \(D\delta \) playing the role of the independent variable

$$\begin{aligned} L\delta = D^m\delta + a_{m-1}D^{m-1}\delta + \cdots + a_1D\delta + a_0\delta \,. \end{aligned}$$

Any polynomial can be represented as a product of factors

$$\begin{aligned} P(z) = (z - z_1)(z - z_2) \cdots (z - z_m) \end{aligned}$$

with \(z_j\) the zeros that may or may not be distinct. From this and remembering that

$$\begin{aligned} D^k\delta *D^i\delta = D^{k+i}\delta \end{aligned}$$

we deduce that the distribution \(L\delta \) can be factored in a similar way. If we denote by \(f^{*k}\) the convolution product of \(k\) distributions equal to f, with the convention \(f^{*0} = \delta \), and group common factors, then \(L\delta \) can be represented as

$$\begin{aligned} L\delta = (D\delta - z_1\delta )^{*l_1} *(D\delta - z_2\delta )^{*l_2} *\cdots *(D\delta - z_n\delta )^{*l_n} \end{aligned}$$

with \(l_j\) the multiplicity of the jth factor. The inverse \((L\delta )^{*-1}\) can then also be factored

$$\begin{aligned} (L\delta )^{*-1} = (D\delta - z_1\delta )^{*- l_1} *(D\delta - z_2\delta )^{*- l_2} *\cdots *(D\delta - z_n\delta )^{*- l_n} \end{aligned}$$

\(f^{*- k}\) denoting the inverse of \(f^{*k}\). With this factorization the elementary solution can either be directly expressed as a convolution product

$$\begin{aligned} \textsf{1}_{+}(t) \, e(t) = \textsf{1}_{+}(t) \frac{t^{l_1-1}}{(l_1-1)!} e^{z_1 t} *\cdots *\textsf{1}_{+}(t) \frac{t^{l_n-1}}{(l_n-1)!} e^{z_n t} \end{aligned}$$

or, by first performing a partial fraction expansion, can be expressed as a sum of convolution-free known distributions.
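For two distinct simple zeros (\(l_1 = l_2 = 1\)) the first route amounts, for \(t \ge 0\), to the finite integral \(\int_0^t e^{z_1(t-\tau )}e^{z_2\tau }\,d\tau \). A sketch of the check, assuming sympy, with numeric zeros of our own choosing for the final comparison:

```python
import sympy as sp

t, tau = sp.symbols('t tau', nonnegative=True)
z1, z2 = sp.symbols('z1 z2')

# Convolution of the factor inverses 1_+(t) e^{z1 t} and 1_+(t) e^{z2 t}
conv = sp.integrate(sp.exp(z1 * (t - tau)) * sp.exp(z2 * tau), (tau, 0, t))

# Partial fraction route gives (e^{z1 t} - e^{z2 t})/(z1 - z2);
# compare at distinct numeric zeros to avoid the z1 = z2 special case
expected = (sp.exp(z1 * t) - sp.exp(z2 * t)) / (z1 - z2)
assert sp.simplify((conv - expected).subs({z1: -2, z2: -3})) == 0
```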

To show the relation to the Laplace method, we Laplace transform Eq. (7.10). The Laplace transform of the distribution \(L\delta \) becomes a true polynomial in the variable s and the convolution product becomes the conventional multiplication so that the convolution equation becomes an algebraic equation

$$\begin{aligned} P(s) \, Y(s) &= X(s) + \sum _{k=0}^{m-1} \sigma _k s^k \\ P(s) &= s^m + a_{m-1}s^{m-1} + \cdots + a_1s + a_0\\ &= (s - z_1)^{l_1} (s - z_2)^{l_2} \cdots (s - z_n)^{l_n}\,. \end{aligned}$$

The Laplace transform of the inverse \((L\delta )^{*-1}\) is the reciprocal of P(s) and corresponds to the Laplace transform of the elementary solution e

$$\begin{aligned} E(s) = \frac{1}{P(s)} . \end{aligned}$$

With it the solution of the convolution equation can be written as

$$\begin{aligned} Y(s) = E(s) X(s) + E(s) \sum _{k=0}^{m-1} \sigma _k s^k\,. \end{aligned}$$

The solution y of the original equation is then found by inverse Laplace transforming Y. In most cases this is most conveniently accomplished by partial fraction expansion.

This shows the parallelism between convolution equations in \({\mathcal {D_+'}}\) on one side and the Laplace transform method on the other one. In particular the distribution \(D\delta \) is the time-domain counterpart of the variable s, the convolution product the counterpart of the ordinary multiplication and \(\delta \) the one of the multiplicative unit element 1.

Example 7.3

Consider the differential equation

$$\begin{aligned} \left[ D^2 + (a - b)D- a b \right] y(t) = x(t) \end{aligned}$$

with initial conditions \((Dy)(0) = y(0) = 0\) and assume that a and b are different constants. The corresponding convolution equation

$$\begin{aligned} (D\delta + a\delta ) *(D\delta - b\delta ) *y = x \end{aligned}$$

has as elementary solution the convolution product

$$\begin{aligned} e = (D\delta + a\delta )^{*-1} *(D\delta - b\delta )^{*-1} \end{aligned}$$

with partial fraction expansion (see Example 7.1)

$$\begin{aligned} e = \frac{1}{a + b} \left[ -(D\delta + a\delta )^{*-1} + (D\delta - b\delta )^{*-1} \right] . \end{aligned}$$

The inverse elements appearing in e were calculated in Example 7.2. Using those results we can express the elementary solution of the equation as

$$\begin{aligned} e(t) = \frac{1}{a + b} \left[ -\textsf{1}_{+}(t) \, e^{-a t} + \textsf{1}_{+}(t) \, e^{b t} \right] . \end{aligned}$$

If we Laplace transform the equation, the procedure is completely parallel. The Laplace transform of the elementary solution is

$$\begin{aligned} E(s) = \frac{1}{a + b} \left[ \frac{-1}{s + a} + \frac{1}{s - b} \right] \end{aligned}$$

and by inversion we obtain the same distribution e.
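The elementary solution of Example 7.3 can also be verified by direct substitution: for \(t > 0\) it must solve the homogeneous equation, with \(e(0) = 0\) and \((De)(0) = 1\). A sketch assuming sympy:

```python
import sympy as sp

t, a, b = sp.symbols('t a b')

# Elementary solution of [D^2 + (a - b)D - ab] y = x from Example 7.3, t >= 0
e = (-sp.exp(-a * t) + sp.exp(b * t)) / (a + b)

# e solves the homogeneous equation for t > 0 ...
assert sp.simplify(sp.diff(e, t, 2) + (a - b) * sp.diff(e, t) - a * b * e) == 0

# ... with initial conditions e(0) = 0 and (De)(0) = 1
assert e.subs(t, 0) == 0
assert sp.simplify(sp.diff(e, t).subs(t, 0) - 1) == 0
```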

We have seen that the initial value problem described by (7.7) and (7.8) can equivalently be described by the convolution equation (7.10). While the differential equation only has a meaning if x is a continuous function with isolated jump discontinuities, the convolution equation remains well-defined if \(\textsf{1}_{+}x\) is replaced by any distribution in \({\mathcal {D_+'}}\). In particular, we can consider more general convolution equations of the form

$$\begin{aligned} L \delta *y = N \delta *x + \sum _{k=0}^{m-1} \sigma _k D^k\delta \end{aligned}$$

with

$$\begin{aligned} N = b_nD^n + b_{n-1}D^{n-1} + \cdots + b_0\,, \end{aligned}$$

x any distribution in \({\mathcal {D_+'}}\), and where it is understood that the solution y must belong to \({\mathcal {D_+'}}\) as well. As before, the solution of the equation is found by convolving with the convolutional inverse of \(L\delta \)

$$\begin{aligned} y = (L \delta )^{*-1} *N \delta *x + \sum _{k=0}^{m-1} \sigma _k (L \delta )^{*-1} *D^k\delta \,. \end{aligned}$$

We want to establish whether it is possible to replace the second summand on the right-hand side, representing the initial conditions, by a suitably selected input signal composed of a weighted sum of a Dirac pulse and its derivatives, such that, in the complement of \(t=0\), the solution y remains unchanged. To this end it is convenient to consider the Laplace transform of y

$$\begin{aligned} Y(s) = \frac{Z(s)}{P(s)} X(s) + \frac{\sum _{k=0}^{m-1} \sigma _k s^k}{P(s)} \end{aligned}$$

with \(Z = {\mathcal {L}}\{N\delta \}\) a polynomial of degree n and the other symbols having the same meaning as before. The Laplace transform of the sought-after input signal is a polynomial

$$\begin{aligned} X(s) = x_qs^q + \cdots + x_0 \end{aligned}$$

and it must be selected in such a way as to satisfy the equality

$$\begin{aligned} \frac{Z(s)}{P(s)} X(s) = \frac{\sum _{k=0}^{m-1} \sigma _k s^k}{P(s)} + W(s) \end{aligned}$$

with W(s) another polynomial. This polynomial also corresponds to a weighted sum of a Dirac pulse and its derivatives, and hence only changes y at \(t=0\), which we allow.

The conditions for the existence of such an input signal X(s) can be determined with the help of the division theorem of polynomials. It states that, given polynomials Q(s) and \(P(s) \ne 0\), there are unique polynomials R(s) and W(s) satisfying

$$\begin{aligned} Q(s) = P(s) W(s) + R(s) \end{aligned}$$

with the degree of R(s) lower than that of P(s) [21]. From this theorem we deduce that, provided Z(s) and P(s) are relatively prime, that is, that they have no common factors, we can select X(s) so that the remainder of the division of Z(s)X(s) by P(s) equals \(\sum _{k=0}^{m-1} \sigma _k s^k\). To achieve this we need m degrees of freedom, one for each \(\sigma _k\). In other words, the input polynomial X(s) must have degree \(m-1\). We can then choose the coefficients of X(s) so as to obtain the desired remainder.

If Z(s) and P(s) have a common factor K(s), so that \(Z(s) = K(s)Z'(s)\) and \(P(s) = K(s)P'(s)\), then

$$\begin{aligned} \begin{aligned} \frac{Z(s) X(s)}{P(s)} &= \frac{K(s) Z'(s) X(s)}{K(s) P'(s)} = \frac{K(s)}{K(s)} \left( \frac{R(s)}{P'(s)} + W(s) \right) \\ &= \frac{K(s) R(s)}{P(s)} + W(s) \end{aligned} \end{aligned}$$

and we see that the remainder of the division, K(s)R(s), has a constrained form that cannot be made to match \(\sum _{k=0}^{m-1} \sigma _k s^k\) for arbitrary coefficients \(\sigma _k\).

We have therefore established that, in a convolution equation derived from an initial value problem, the terms representing the initial conditions can be replaced by a distribution x composed of a weighted sum of a Dirac pulse and its derivatives if and only if Z(s) and P(s) have no common factors. If we perform this substitution then, in the complement of \(t=0\), the solution y of the equation remains unchanged.

Example 7.4: Replacing Initial Conditions

Consider the initial value problem

$$\begin{aligned}\begin{gathered} \left( D^2 + a_1D+ a_0 \right) y = \left( b_1D+ b_0 \right) x\\ (Dy)(0) = y_1, \qquad y(0) = y_0\,. \end{gathered}\end{aligned}$$

The corresponding convolution equation is

$$\begin{aligned} (D^2\delta + a_1D\delta + a_0\delta ) *y = (b_1D\delta + b_0\delta ) *x + y_0D\delta + (a_1y_0 + y_1)\delta \,. \end{aligned}$$

Our objective is to replace the initial conditions by an input signal composed of a Dirac pulse and its derivatives so that, in the complement of \(t=0\), the solution y of the convolution equation with this input signal is identical to the solution of the equation with initial conditions and no input signal.

Expressed in the Laplace domain the problem is thus to find the coefficients of the polynomial

$$\begin{aligned} X(s) = x_1s + x_0 \end{aligned}$$

such that

$$\begin{aligned} \frac{Z(s) X(s)}{P(s)} = \frac{R(s)}{P(s)} + W(s) \end{aligned}$$

with

$$\begin{aligned} Z(s) &= b_1s + b_0, & P(s) &= s^2 + a_1s + a_0, & R(s) &= y_0s + a_1y_0 + y_1 \end{aligned}$$

and W(s) an arbitrary polynomial of degree lower than 2. By performing the polynomial division of the left-hand side of the equation we obtain

$$\begin{aligned} \frac{s (- a_1 b_1 x_1 + b_0 x_1 + b_1 x_0) - a_0 b_1 x_1 + b_0 x_0}{s^2 + a_1s + a_0} + b_1 x_1\,. \end{aligned}$$

Thus \(W(s) = b_1 x_1\) and, by comparing coefficients of this expression with the right-hand side of the equation, the coefficients of X(s) are found to be

$$\begin{aligned} x_0 &= \frac{(a_1 b_1 - b_0) y_1 + [(a_1^2 - a_0) b_1 - a_1 b_0] y_0}{a_1 b_0 b_1 - a_0 b_1^2 - b_0^2}\\ x_1 &= \frac{b_1 y_1 + (a_1 b_1 - b_0) y_0}{a_1 b_0 b_1 - a_0 b_1^2 - b_0^2}\,. \end{aligned}$$

This solution is well-defined except when the denominator, which is the same for both \(x_1\) and \(x_0\), becomes zero. This happens when

$$\begin{aligned} a_1 = \frac{a_0 b_1}{b_0} + \frac{b_0}{b_1}. \end{aligned}$$

In this case the polynomial Z(s) becomes a factor of P(s)

$$\begin{aligned} s^2 + \left( \frac{a_0b_1}{b_0} + \frac{b_0}{b_1}\right) s + a_0 = (b_1 s + b_0) \left( \frac{1}{b_1} s + \frac{a_0}{b_0}\right) \end{aligned}$$

in accordance with our general treatment of the problem.
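The computation of Example 7.4 can be reproduced mechanically: divide Z(s)X(s) by P(s) and equate the remainder to R(s). A sketch assuming sympy, with symbol names mirroring the example:

```python
import sympy as sp

s, a0, a1, b0, b1, y0, y1, x0, x1 = sp.symbols('s a0 a1 b0 b1 y0 y1 x0 x1')

Z = b1 * s + b0                 # Z(s)
P = s**2 + a1 * s + a0          # P(s)
R = y0 * s + a1 * y0 + y1       # remainder encoding the initial conditions
X = x1 * s + x0                 # input polynomial to determine

# Z X = W P + R: divide and equate the remainder of the division to R
_, remainder = sp.div(sp.expand(Z * X), P, s)
sol = sp.solve(sp.Poly(remainder - R, s).all_coeffs(), [x0, x1], dict=True)[0]

# Check: with these coefficients the division leaves exactly the remainder R
_, r_check = sp.div(sp.expand(Z * X.subs(sol)), P, s)
assert sp.simplify(r_check - R) == 0
```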

Before concluding this section we show the important fact, mentioned before, that the convolution algebra \({\mathcal {D_+'}}\) has no zero divisors. To see this, consider a test function \(\phi \) that is real-valued and positive everywhere on its support, for example \(\beta _\nu \) from Example 2.1. We call such a function a positive test function. In Chap. 3 we saw that every distribution can be represented as the limit of a sequence of indefinitely differentiable functions. Let \((g_m)\) and \((y_i)\) be such sequences converging to g and y respectively and, for simplicity, assume that all functions are real-valued. Then, for every m there exists an open interval U contained in the support of \(g_m\) such that, for every positive test function \(\zeta \) with support in U, the value \(\langle g_m,\zeta \rangle \) always has the same sign, for example positive

$$\begin{aligned} \langle g_m,\zeta \rangle > 0\,. \end{aligned}$$

We can make a similar construction for \(y_i\) as well. In addition, we can introduce a parameter \(\lambda \) such that

$$\begin{aligned} \lambda \mapsto \langle y_i(\tau ),\phi (\tau + \lambda ) \rangle \end{aligned}$$

is a positive (or negative) test function of \(\lambda \) with support in U. Then, assuming again a positive sign,

$$\begin{aligned} \langle g_m *y_i,\phi \rangle = \langle g_m(\lambda ),\langle y_i(\tau ),\phi (\tau + \lambda ) \rangle \rangle \end{aligned}$$

must be positive for every m and i and, by the continuity of distributions and of the convolution, so must be the limit. Consequently, if \(g *y\) vanishes for every test function, then either g or y must be the zero distribution.

4 Integro-Differential Equations

Some initial value problems are naturally formulated as ordinary integro-differential equations

$$\begin{aligned} &{ D^m y(t) + a_{m-1}D^{m-1} y(t) + \cdots + a_1Dy(t) + a_0 y(t) } \\ &\,+ a_{-1} \int _0^t y(\tau _1) \, d\tau _1 + \cdots + a_{-n} \int _0^t \cdots \int _0^{\tau _{n-1}} y(\tau _n) \, d\tau _n \cdots d\tau _1 \\ &\,= x(t) \end{aligned}$$

with initial conditions

$$\begin{aligned} (D^k y)(0) = y_k \qquad k=0,\ldots ,m-1\,. \end{aligned}$$
(7.12)

We still need initial conditions, but this time only m of them as the remaining information is included in the integrals.

These problems can be converted into convolution equations in the convolution algebra \({\mathcal {D_+'}}\) in a similar way as we discussed before. The new terms are the ones that are expressed as integrals and these can be written as convolution products

$$\begin{aligned} \int _0^t y(\tau _1) \, d\tau _1 &= \textsf{1}_{+}(t) *\textsf{1}_{+}(t) y(t) \\ &\;\;\vdots \\ \int _0^t \cdots \int _0^{\tau _{n-1}} y(\tau _n) \, d\tau _n \cdots d\tau _1 &= \textsf{1}_{+}^{*n}(t) *\textsf{1}_{+}(t) y(t)\,. \end{aligned}$$

The corresponding convolution equation is therefore

$$\begin{aligned} &\left( D^m \delta + a_{m-1}D^{m-1} \delta + \cdots + a_1D\delta + a_0 \delta \right. \\ &\qquad \left. +\, a_{-1} \textsf{1}_{+} + \cdots + a_{-n} \textsf{1}_{+}^{*n} \right) *\textsf{1}_{+}(t) y(t) \\ &\quad = \textsf{1}_{+}(t) x(t) + \sum _{k=0}^{m-1} \sigma _k D^k\delta \end{aligned}$$

with \(\sigma _k, k=0,\ldots ,m-1\) as defined in (7.9).

As we have seen, the convolution algebra \({\mathcal {D_+'}}\) is an integral domain. For this reason we can convolve both sides of the equation with a non-zero distribution without changing its solutions. If we choose \(D^n \delta \) as the distribution and make use of the fact that \(\textsf{1}_{+}(t)\) is the inverse of \(D\delta \)

$$\begin{aligned} D\delta *\textsf{1}_{+}= \delta \end{aligned}$$

the equation becomes

$$\begin{aligned} &\left( D^{m+n} \delta + a_{m-1}D^{m-1+n} \delta + \cdots + a_{-n} \delta \right) *\textsf{1}_{+}(t) y(t) \\ &\quad = D^n \delta *\textsf{1}_{+}(t) x(t) + \sum _{k=0}^{m-1} \sigma _k D^{k+n}\delta \,. \end{aligned}$$

This is the type of convolution equation that we discussed in Sect. 7.3 and is solved by the same method. The solution of integro-differential equations thus requires no new technique.

The procedure of transforming the convolution equation that we just discussed is similar to the standard procedure used to convert an integro-differential equation into a differential equation by differentiating it. The key difference is that, while the former handles the initial conditions automatically, the latter requires extracting additional initial conditions from the original equation.
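As a worked sketch (the specific equation and numbers are our own choice, assuming sympy): the integro-differential equation \(Dy + 2y + \int_0^t y(\tau )\,d\tau = 0\) with \(y(0) = 1\) converts, after convolving with \(D\delta \), to \(D^2y + 2Dy + y = 0\) with \(y(0) = 1\) and \((Dy)(0) = -2\), whose solution for \(t \ge 0\) is \(y(t) = (1-t)e^{-t}\):

```python
import sympy as sp

t, tau = sp.symbols('t tau')

# Solution of D^2 y + 2 Dy + y = 0 with y(0) = 1, (Dy)(0) = -2
y = (1 - t) * sp.exp(-t)

# It satisfies the original integro-differential equation
# Dy + 2 y + integral_0^t y(tau) dtau = 0 directly
residual = sp.diff(y, t) + 2 * y + sp.integrate(y.subs(t, tau), (tau, 0, t))
assert sp.simplify(residual) == 0
assert y.subs(t, 0) == 1
```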

5 Periodic Solutions

One is often interested in periodic solutions of differential equations. These solutions are most conveniently found with the help of the convolution algebra of periodic distributions.

Consider again the convolution equation obtained from the differential operator L of Sect. 7.3 where now the unit element of the algebra is the Dirac comb \(\delta _{\mathcal {T}}\)

$$\begin{aligned} L\delta _{\mathcal {T}}*y = x\,. \end{aligned}$$

In Sect. 4.5 we established two important properties of the Fourier series:

  1.

    The first is that the kth Fourier coefficient of the convolution product of two Fourier series equals the product of the kth coefficients of the individual series times the period (Eq. (4.20)).

  2.

    The second is that differentiation corresponds to multiplication of the kth Fourier coefficient by the factor \(\jmath k \omega _c\) with \(\omega _c = 2\pi /{\mathcal {T}}\). For this reason the kth Fourier coefficient of the distribution \(L\delta _{\mathcal {T}}\) is proportional to a polynomial P evaluated at \(\jmath k \omega _c\)

    $$\begin{aligned} c_k(L\delta _{\mathcal {T}}) &= \left[ (\jmath k \omega _c)^m + a_{m-1}(\jmath k \omega _c)^{m-1} + \cdots + a_1(\jmath k \omega _c) + a_0 \right] \frac{1}{{\mathcal {T}}} \\ &= P(\jmath k \omega _c) \frac{1}{{\mathcal {T}}}\,. \end{aligned}$$

By representing both x and y by their respective Fourier series and using these two properties, we can transform the above convolution equation into algebraic equations for the Fourier coefficients. Let’s denote by \(c_k\) the kth Fourier coefficient of x and by \(d_k\) the one of y. Then the equation becomes

$$\begin{aligned} (\jmath k \omega _c - z_1)^{l_1} (\jmath k \omega _c - z_2)^{l_2} \cdots (\jmath k \omega _c - z_n)^{l_n} \, d_k = c_k \end{aligned}$$

where, as before, we have expressed the polynomial P by its zero factors. To solve the equation we have to distinguish three cases:

  1.

    If one or more zeros \(z_j\) of the polynomial equal \(\jmath k \omega _c\) for some integer k and the coefficient \(c_k\) of x is different from zero, then the equation has no solution.

  2.

    If one or more zeros \(z_j\) of the polynomial equal \(\jmath k \omega _c\) for some integer k and the coefficient \(c_k\) is zero, then the equation has an infinity of solutions. In fact, in this case \(d_k\) can be any number. Note also that if \(z_j\) is equal to \(\jmath k \omega _c\) then the convolution product

    $$\begin{aligned} L\delta _{\mathcal {T}}*e^{\jmath k \omega _c t} = 0 \end{aligned}$$

    vanishes, which means that the convolution algebra of periodic distributions has zero divisors.

  3.

    If no zero \(z_j\) equals \(\jmath k \omega _c\) for any value of k, then the equation has the unique solution given by the Fourier series with coefficients

    $$\begin{aligned} d_k = \frac{c_k}{P(\jmath k \omega _c)}\,. \end{aligned}$$

Example 7.5: Cont. of Example 7.3

We look for a periodic solution of the convolution equation of Example 7.3

$$\begin{aligned} (D\delta _{\mathcal {T}}+ a\delta _{\mathcal {T}}) *(D\delta _{\mathcal {T}}- b\delta _{\mathcal {T}}) *y = x \end{aligned}$$

assuming that the real parts of a and b are both positive. In particular, we are interested in the elementary solution e of the equation. By setting \(x = \delta _{\mathcal {T}}\) and expanding it by its Fourier series we obtain the following expression for the kth Fourier coefficient of e

$$\begin{aligned} e_k = \frac{1}{{\mathcal {T}}} \cdot \frac{1}{(\jmath k \omega _c + a) (\jmath k \omega _c - b)}\,. \end{aligned}$$

By performing a partial fraction expansion and with the help of (4.24), we recognize them as the coefficients of the Fourier series of the distribution

$$\begin{aligned} e(t) = g(t) *\delta _{\mathcal {T}}\end{aligned}$$

with

$$\begin{aligned} g(t) = \frac{-1}{a + b} \left[ \textsf{1}_{+}(t) \, e^{-a t} + \textsf{1}_{+}(-t) \, e^{b t} \right] \,. \end{aligned}$$

In fact the Fourier transform of g is

$$\begin{aligned} \hat{g}(\omega ) = & {} \frac{1}{a + b} \left[ \frac{-1}{\jmath \omega + a} + \frac{1}{\jmath \omega - b} \right] \\ = & {} \frac{1}{(\jmath \omega + a) (\jmath \omega - b)}\,. \end{aligned}$$

Note that g is a distribution of slow growth. The elementary solution of the equation in the algebra of periodic distributions is therefore the sum of periodically shifted tempered solutions of the differential equation.
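The partial fraction expansion used above can be spot-checked numerically. The snippet below (with arbitrarily chosen values of a and b satisfying the positivity assumption) verifies that the expanded form of \(\hat{g}\) agrees with the product form on a grid of frequencies.

```python
import numpy as np

# Sanity check (ours, not from the text): the partial-fraction form of
# g-hat equals the product form 1 / ((j w + a)(j w - b)).

a, b = 0.7, 1.3                  # Re a, Re b > 0, as assumed above
w = np.linspace(-10.0, 10.0, 1001)

pf = (1.0 / (a + b)) * (-1.0 / (1j * w + a) + 1.0 / (1j * w - b))
prod = 1.0 / ((1j * w + a) * (1j * w - b))

max_err = np.max(np.abs(pf - prod))
```

The two expressions agree to machine precision, confirming the coefficients \(-1/(a+b)\) and \(1/(a+b)\) used in g.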

Suppose now that we are interested in the solution for \(x(t) = A e^{\jmath \omega _c t}\). The only Fourier coefficient of x different from zero is \(c_1 = A\). The Fourier coefficients of y are then also all zero except for

$$\begin{aligned} d_1 = {\mathcal {T}}\, c_1 \, e_1 = A \, \hat{g}(\omega _c)\,. \end{aligned}$$

In this case the solution y of the equation is therefore

$$\begin{aligned} y(t) = A \, \hat{g}(\omega _c) \, e^{\jmath \omega _c t}\,. \end{aligned}$$
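The steady-state formula can be checked by applying the differential operator \((D+a)(D-b) = D^2 + (a-b)D - ab\) to the proposed y with finite differences and comparing against x. The numbers and discretization below are our own illustrative choices.

```python
import numpy as np

# Numerical check that y(t) = A g-hat(omega_c) e^{j omega_c t} satisfies
# (D + a)(D - b) y = x with x(t) = A e^{j omega_c t}.

a, b, A, omega_c = 0.7, 1.3, 2.0, 1.0
t = np.linspace(0.0, 10.0, 20001)
h = t[1] - t[0]

g_hat = 1.0 / ((1j * omega_c + a) * (1j * omega_c - b))
y = A * g_hat * np.exp(1j * omega_c * t)
x = A * np.exp(1j * omega_c * t)

dy = np.gradient(y, h)                                 # y'
lhs = np.gradient(dy, h) + (a - b) * dy - a * b * y    # (D+a)(D-b) y

err = np.max(np.abs(lhs - x)[2:-2])   # skip boundary stencils
```

Up to discretization error, the left-hand side reproduces x, as the algebra \((\jmath \omega _c + a)(\jmath \omega _c - b)\,\hat{g}(\omega _c) = 1\) predicts.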

6 General Convolution Equations

6.1 General Solutions

In this section we consider generic convolution equations of the form

$$\begin{aligned} g *y = x \end{aligned}$$

with g, y and x generic distributions in \({\mathcal {D}}'\). Here the situation is different from when working in a convolution algebra. First, the convolution between g and y may not exist. To guarantee its existence we require g to have compact support. This includes many important cases, for example, all linear differential operators with constant coefficients.

Second, g may not have an inverse. For example, if \(g\in {\mathcal {D}}\) we have seen in Sect. 3.2 that \(g *y \in {\mathcal {E}}\) and so can’t equal \(\delta \) for any \(y\in {\mathcal {D}}'\). If g does have an inverse, then the equation has an elementary solution, but it only serves to find solutions for x having compact support, since otherwise the last convolution in

$$\begin{aligned} y = g^{*-1} *g *y = g^{*-1} *x \end{aligned}$$

may not make sense.

Further, the homogeneous equation

$$\begin{aligned} g *y = 0 \end{aligned}$$

may have solutions different from \(y=0\). For this reason there may be an infinity of elementary solutions, any two of them differing by a solution of the homogeneous equation.

Despite these facts, general convolution equations have many practical applications.

Example 7.6: Electrostatics

Let \(\rho \) denote the electric charge density and u the electrostatic potential, both functions of the position in space. In empty space the two quantities are related by Poisson’s equation

$$\begin{aligned} \Delta u(x) = -\frac{\rho (x)}{\epsilon _0} \end{aligned}$$

with \(\Delta \) the Laplace operator, \(x \in {\mathbb {R}}^3\) the vector specifying position and \(\epsilon _0\) the permittivity of free space. This equation can be written as a convolution equation

$$\begin{aligned} \Delta \delta *u = -\frac{\rho (x)}{\epsilon _0}\,. \end{aligned}$$

One can show that the inverse of \(\Delta \delta \) is

$$\begin{aligned} -\frac{1}{4\pi \, |x|}\,. \end{aligned}$$

If the charge density \(\rho \) is distributed over a finite region \(\Omega \subset {\mathbb {R}}^3\) then the generated potential is

$$\begin{aligned} u(x) = \frac{1}{4\pi \epsilon _0 \, |x|} *\rho (x)\,. \end{aligned}$$

The homogeneous equation has solutions different from the trivial one: the so-called harmonic functions.
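As a small numerical illustration (the sample point and step size are our own choices), one can check by finite differences that the kernel \(-1/(4\pi |x|)\) is indeed harmonic away from the origin, consistent with its Laplacian being supported at the origin only.

```python
import numpy as np

# Spot-check: away from the origin the kernel -1/(4 pi |x|) is harmonic,
# i.e. its Laplacian vanishes there. We use the standard 7-point stencil.

def kernel(x, y, z):
    return -1.0 / (4.0 * np.pi * np.sqrt(x * x + y * y + z * z))

h = 1e-3
x0, y0, z0 = 1.0, 0.5, -0.7       # a point well away from the origin

lap = (kernel(x0 + h, y0, z0) + kernel(x0 - h, y0, z0)
       + kernel(x0, y0 + h, z0) + kernel(x0, y0 - h, z0)
       + kernel(x0, y0, z0 + h) + kernel(x0, y0, z0 - h)
       - 6.0 * kernel(x0, y0, z0)) / (h * h)
```

The discrete Laplacian is zero to within the truncation error of the stencil.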

6.2 Tempered Solutions

If x is tempered and one is interested in tempered solutions of the equation then the convolution equation makes sense not only for g of compact support, but for the larger class of distributions of rapid descent \({\mathcal {O}}'_C\) [16]. This case is particularly important because one can then use the Fourier transform, which may make it easier to find a solution.

In the following we briefly consider the one dimensional case where \(g = L\delta \) with L a linear differential operator with constant coefficients, so that the convolution equation has the form

$$\begin{aligned} L\delta *y = x. \end{aligned}$$

In this case there is always at least one elementary solution. If we Fourier transform both sides of the equation we find the equivalent equation

$$\begin{aligned} P\,\hat{y} = \hat{x} \end{aligned}$$

with P a polynomial (and thus in \({\mathcal {O}}_M\)).

If P has no zeros, then the only solution of the homogeneous equation is the trivial one and the inverse of P is a function of slow growth \(1/P \in {\mathcal {O}}_M\). The only elementary solution of the equation is therefore the summable distribution

$$\begin{aligned} e = {\mathcal {F}}^{-1}\{\frac{1}{P}\}\,. \end{aligned}$$

If P has a zero at \(\omega _p\) then the homogeneous equation has nontrivial solutions. In particular, we saw in Sect. 2.5.1 that if the multiplicity of the zero is k then the sums

$$\begin{aligned} \sum _{m=0}^{k-1} c_m\,D^m\delta (\omega - \omega _p) \end{aligned}$$

with \(c_m\) constants, are all solutions of the Fourier transformed homogeneous equation \(P\,\hat{y} = 0\). The solutions of the original homogeneous equation are found by inverse Fourier transformation to be

$$\begin{aligned} \sum _{m=0}^{k-1}\frac{c_m}{2\pi }\,(-\jmath t)^m\, e^{\jmath \omega _pt}\,. \end{aligned}$$

The equation has therefore an infinity of elementary solutions. In addition, since \(1/P\not \in {\mathcal {O}}_M\), the solutions are not summable distributions.
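These homogeneous solutions are easy to verify numerically. In the sketch below the concrete operator is our own illustrative choice: its symbol \((s - \jmath \omega _p)^2\) has a double zero, so both \(e^{\jmath \omega _p t}\) and \(t\, e^{\jmath \omega _p t}\) should be annihilated.

```python
import numpy as np

# Check that for an operator whose symbol has a double zero at omega_p,
# P(s) = (s - j omega_p)^2 = s^2 - 2 j omega_p s - omega_p^2,
# both e^{j omega_p t} and t e^{j omega_p t} solve L y = 0.

omega_p = 1.3
t = np.linspace(-5.0, 5.0, 20001)
h = t[1] - t[0]

def apply_L(y):
    dy = np.gradient(y, h)
    return np.gradient(dy, h) - 2j * omega_p * dy - omega_p**2 * y

y0 = np.exp(1j * omega_p * t)
y1 = t * np.exp(1j * omega_p * t)

err0 = np.max(np.abs(apply_L(y0))[2:-2])   # skip boundary stencils
err1 = np.max(np.abs(apply_L(y1))[2:-2])
```

Both residuals vanish up to discretization error, matching the span of solutions given above for \(k = 2\).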

Note that the equation may have non-tempered solutions that are not captured by Fourier transform techniques.

Example 7.7: Cont. of Example 7.3

We look for a tempered solution of the convolution equation of Example 7.3

$$\begin{aligned} (D\delta + a\delta ) *(D\delta - b\delta ) *y = x \end{aligned}$$

assuming that the real parts of a and b are both positive. A tempered elementary solution is easily found by solving the Fourier transformed equation

$$\begin{aligned} \hat{e}(\omega ) = \frac{1}{P(\omega )} = \frac{1}{(\jmath \omega + a) (\jmath \omega - b)} \end{aligned}$$

and determining its inverse Fourier transform

$$\begin{aligned} e(t) = \frac{-1}{a + b} \left[ \textsf{1}_{+}(t) \, e^{-a t} + \textsf{1}_{+}(-t) \, e^{b t} \right] \,. \end{aligned}$$

Note that despite the similarity between \(\hat{e}(\omega )\) and E(s) of Example 7.3 the tempered elementary solution is different from the solution found in the convolution algebra \({\mathcal {D'}}_+\).

Since \(P(\omega )\) has no zeros for real \(\omega \), e is the only tempered elementary solution of the equation. Other solutions, obtained by adding any linear combination of the solutions of the homogeneous equation (\(e^{-a t}\) and \(e^{b t}\)), grow exponentially as t tends either to \(\infty \) or to \(-\infty \) and are therefore not tempered distributions.
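A quick way to see that this e is an elementary solution of the second-order operator is to check its behavior at \(t = 0\): e must be continuous there while \(e'\) jumps by exactly 1, which is how \((D+a)(D-b)\,e\) produces the \(\delta \). The check below uses arbitrary positive values for a and b (our choice).

```python
import numpy as np

# Jump-condition check for e(t) = -[1_+(t) e^{-at} + 1_+(-t) e^{bt}] / (a+b).

a, b = 0.7, 1.3

def e_plus(t):   # branch of e for t > 0
    return -np.exp(-a * t) / (a + b)

def e_minus(t):  # branch of e for t < 0
    return -np.exp(b * t) / (a + b)

cont_gap = e_plus(0.0) - e_minus(0.0)        # continuity: should vanish
slope_jump = (a / (a + b)) - (-b / (a + b))  # e'(0+) - e'(0-): should be 1
```

Away from \(t=0\) each branch solves the homogeneous equation, so the delta arises entirely from the unit derivative jump.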

7 Systems of Convolution Equations

One often has to solve a set of n simultaneous equations in n unknown distributions \(y_1,\ldots ,y_n\)

$$\begin{aligned} \begin{matrix} g_{11} *y_1 + g_{12} *y_2 + \cdots + g_{1n} *y_n &{}=&{} x_1\\ g_{21} *y_1 + g_{22} *y_2 + \cdots + g_{2n} *y_n &{}=&{} x_2\\ \vdots &{} \vdots &{} \vdots \\ g_{n1} *y_1 + g_{n2} *y_2 + \cdots + g_{nn} *y_n &{}=&{} x_n \end{matrix} \end{aligned}$$

with \(g_{jm}\) coefficient distributions, \(x_1,\ldots ,x_n\) right-hand side distributions and where all distributions belong to a convolution algebra \({\mathcal {A}}'\). This system of equations can conveniently be written in matrix form

$$\begin{aligned} G *Y = X \end{aligned}$$
(7.13)

with G the \(n \times n\) matrix with elements \(g_{jm}\) and Y, X vector-valued distributions (column matrices) with elements \(y_j\) and \(x_j\) respectively. The space of vector-valued distributions is denoted by \({\mathcal {D'}}({\mathbb {R}}^m,{\mathbb {C}}^n)\) and application of a test function \(\phi \in {\mathcal {D}}({\mathbb {R}}^m)\) to a vector X is defined as the application of \(\phi \) to each component individually

$$\begin{aligned} \langle X,\phi \rangle :=\begin{bmatrix} \langle x_1,\phi \rangle \\ \vdots \\ \langle x_n,\phi \rangle \end{bmatrix}\,. \end{aligned}$$

The determinant of the matrix G is defined as usual, with the convolution product replacing the standard product. It is a distribution belonging to the convolution algebra \({\mathcal {A}}'\). For example, the determinant of a \(2 \times 2\) matrix G is

$$\begin{aligned} \det \begin{bmatrix} g_{11} &{} g_{12}\\ g_{21} &{} g_{22} \end{bmatrix} = g_{11} *g_{22} - g_{21} *g_{12}\,. \end{aligned}$$

Suppose that the matrix G has an inverse \(G^{*-1}\)

$$\begin{aligned} G *G^{*-1} = \delta I \end{aligned}$$

where \(\delta I\) is the identity matrix with the unit of \({\mathcal {A}}'\) on the diagonal and 0 everywhere else. If we compute the determinant of both sides of this equation we obtain

$$\begin{aligned} \det (G *G^{*-1}) = \det (G) *\det (G^{*-1}) = \det (\delta I) = \delta \end{aligned}$$

from which we deduce that, if the matrix G has an inverse, then \(\det (G)\) has an inverse in \({\mathcal {A}}'\). Conversely, if \(\det (G)\) has an inverse, then we can compute the inverse of G by

$$\begin{aligned} G^{*-1} = \det (G)^{*-1} *\tilde{G}^T \end{aligned}$$

with \(\tilde{G}\) the matrix of cofactors and \(\tilde{G}^T\) its transpose.

We conclude that (7.13) has a solution for arbitrary right-hand side X if and only if \(\det (G)\) has an inverse in \({\mathcal {A}}'\). The solution is given by

$$\begin{aligned} Y = G^{*-1} *X\,. \end{aligned}$$
(7.14)

One shows in a similar way as for a single equation (see Sect. 7.2) that \(G^{*-1}\) and hence the solution of the equation is unique.
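The procedure behind (7.14) can be illustrated in a concrete convolution algebra: N-periodic sequences under circular convolution, where \(\delta = [1, 0, \ldots , 0]\) and the DFT turns convolution into ordinary multiplication. The sketch below (all data randomly generated, our own construction) solves a \(2 \times 2\) system \(G *Y = X\) by solving an ordinary linear system at each frequency.

```python
import numpy as np

# Solve G * Y = X in the algebra of N-periodic sequences: the DFT maps
# circular convolution to multiplication, so each frequency k yields an
# ordinary 2x2 linear system.

rng = np.random.default_rng(0)
N = 16
G = 0.05 * rng.standard_normal((2, 2, N))   # g_jm as periodic sequences
G[0, 0, 0] += 1.0                           # keep G close to delta * I,
G[1, 1, 0] += 1.0                           # so det(G) is invertible
X = rng.standard_normal((2, N))             # right-hand sides x_j

Gh = np.fft.fft(G, axis=-1)                 # per-frequency 2x2 matrices
Xh = np.fft.fft(X, axis=-1)

Yh = np.stack([np.linalg.solve(Gh[:, :, k], Xh[:, k]) for k in range(N)],
              axis=-1)
Y = np.fft.ifft(Yh, axis=-1).real           # exact solution is real here

def cconv(u, v):
    # circular convolution via the DFT
    return np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)).real

# residual of the original system of convolution equations
resid = max(
    np.max(np.abs(cconv(G[j, 0], Y[0]) + cconv(G[j, 1], Y[1]) - X[j]))
    for j in range(2)
)
```

Keeping G close to \(\delta I\) guarantees each frequency matrix is invertible, mirroring the condition that \(\det (G)\) have an inverse in \({\mathcal {A}}'\).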