1 Introduction

We consider semilinear damped wave equations with a damping term, a structural (visco-elastic) damping term, and a mass term

$$\begin{aligned} \partial _{tt}^2u&- \alpha \Delta u - \beta \Delta (\partial _tu)+ \gamma \partial _tu + \delta u = g(u) + h(\partial _tu), \quad (x,t) \in \Omega \times (0,T), \end{aligned}$$
(1a)
$$\begin{aligned} u|_{\partial \Omega }&= 0, \quad t \in (0, +\infty ), \end{aligned}$$
(1b)
$$\begin{aligned} u(0,x)&= p(x), \quad \partial _tu(0,x) = q(x), \quad x \in \Omega \end{aligned}$$
(1c)

on a bounded and open domain \(\Omega \subset {\mathbb {R}}^d\) with smooth compact boundary \(\partial \Omega \). The term \(\Delta (\partial _tu)\) is the structural (visco-elastic) damping while the term \(\partial _tu\) is the damping term. We assume that \(\beta , \gamma \), and \(\delta \) are three non-negative coefficients. Moreover, the coefficient \(\alpha \) must be positive. The initial data p and q are chosen from the usual energy space \((p, q) \in H_0^1(\Omega ) \times L^2(\Omega ) \). Concerning the nonlinear term, we recall some particular equations from the literature:

  1. (i)

    the perturbed sine-Gordon equation (see [4, 13, 27])

    $$\begin{aligned} \partial _{tt}^2u&- \alpha \Delta u - \beta \Delta (\partial _tu) + \gamma \partial _tu = \sin u; \end{aligned}$$
    (2)
  2. (ii)

    the perturbed wave equation of quantum mechanics (see [4, 13, 27])

    $$\begin{aligned} \partial _{tt}^2u&- \alpha \Delta u - \beta \Delta (\partial _tu) =- |u|^{q} u - |\partial _tu|^{p} (\partial _tu), \quad p,q \ge 0. \end{aligned}$$
    (3)

Another type of second-order in time PDE is the Euler–Bernoulli beam equation with Kelvin–Voigt damping

$$\begin{aligned} \partial _{tt}^2u&+ \partial _{xx}^2(\alpha \partial _{xx}^2u + \beta \partial _{xx}^2(\partial _tu)) + \gamma \partial _tu + \delta u = g(u), \quad (x,t) \in (0, L) \times (0, T), \end{aligned}$$
(4a)
$$\begin{aligned} u(0,x)&= p(x), \quad \partial _tu(0,x) = q(x), \quad x \in (0,L), \end{aligned}$$
(4b)

where \(u(t,x)\) denotes the deflection of the beam from its rigid body motion at time t and position x. For given parameters \(\alpha >0\) and \(\beta \ge 0\), the moment function is

$$\begin{aligned} m(t,x) = \alpha \partial _{xx}^2u(t,x) + \beta \partial _{xx}^2(\partial _tu)(t,x). \end{aligned}$$

The first derivative of the moment \(m(t,x)\) with respect to the variable x represents the shear force. The following boundary conditions will be considered, where \(\xi \in \{0, L\}\):

  1. (a)

    Hinged end: \(u(t,\xi ) = 0,~~m(t,\xi ) = 0\).

  2. (b)

    Clamped end: \(u(t,\xi ) = 0,~~~\partial _xu(t,\xi ) = 0\).

  3. (c)

    Free end: \(m(t,\xi ) = 0,~~~\partial _xm(t,\xi ) = 0\).

  4. (d)

    Sliding end: \(\partial _xu(t,\xi ) = 0,~~~\partial _xm(t,\xi ) = 0\).

Depending on the setup of the beam model, various combinations of boundary conditions are of interest, for example: hinged-hinged boundary conditions

$$\begin{aligned} u(t,0) = 0, \quad m(t,0) = 0, \quad u(t,L) = 0, \quad m(t,L) = 0. \end{aligned}$$

Concerning semilinear beam equations, the nonlinear term \(g(u) = -lu^3\) with \(l >0\) was used in [1, 10, 11], where the authors considered a railway track model.

Both problems (1) and (4) can be rewritten as abstract ordinary differential equations in a product space \(X = H \times L^2(\Omega )\) by introducing the new variable \(y(t,x) = (u(t,x), w(t,x))'\) with \(w = \partial _tu\), namely

$$\begin{aligned} \dot{y}(t) = {\mathcal {A}}y(t) + {\mathcal {F}}(y(t)), \end{aligned}$$

where

$$\begin{aligned}&{\mathcal {A}}= {\begin{bmatrix} 0 &{} \quad I \\ -\alpha (-\Delta ) - \delta I &{} \quad -\beta (-\Delta ) - \gamma I\end{bmatrix}} \quad \text {for (1)}, \\&{\mathcal {A}}= {\begin{bmatrix}0 &{} \quad I \\ -\alpha \partial _{xxxx}^4- \delta I &{} \quad -\beta \partial _{xxxx}^4- \gamma I \end{bmatrix}} \quad \text {for (4)}, \\ \end{aligned}$$

and

$$\begin{aligned} \mathcal {F}(u,w) = {\begin{bmatrix}0 \\ g(u) + h(w)\end{bmatrix}}. \end{aligned}$$

The space H and the domain of the operator \({\mathcal {A}}\) will be chosen to be consistent with the boundary conditions. Here and henceforth, the transpose of a matrix E is denoted by \(E'\).

These types of equations have been studied extensively in many fields of mathematics. For damped wave equations, see [4, 5, 13, 20, 27, 30]; for Euler–Bernoulli beam equations, see [1, 2, 10, 11, 21, 23, 25, 26, 28, 29] and references therein. The time discretization of these equations, to the best of our knowledge, is usually carried out by standard integration schemes such as Runge–Kutta methods or multistep methods. In this article, we will consider exponential integrators to solve this class of PDEs. By spatial discretization of (1) or of (4), we get a semi-discretization of the equation in matrix form

$$\begin{aligned} \dot{y}(t) = A y(t) + F(y(t)), \quad y(0) = y_0 = (p, q)', \end{aligned}$$
(5)

where

$$\begin{aligned} A = {\begin{bmatrix} 0 &{} \quad I \\ -\alpha S - \delta I &{} \quad -\beta S - \gamma I \end{bmatrix}}. \end{aligned}$$
(6)

The square matrix S is the discretized version of the operator \((-\Delta )\) or \(\partial _{xxxx}^4\). The linear part of (5)

$$\begin{aligned} \dot{y}(t) = A y(t) , \quad y(0) = y_0 = (p, q)', \end{aligned}$$
(7)

can be solved exactly; its solution is given by

$$\begin{aligned} y(t) = {\mathrm {e}}^{tA} y_0, \quad t >0. \end{aligned}$$
(8)

For the undamped wave equation (i.e. \(\beta = \gamma = 0\) in (6)), the explicit form of the matrix exponential \({\mathrm {e}}^{tA}\) is easily obtained by using the matrix sine and matrix cosine functions (see [19, Section 3.2]). Based on this formula, Gautschi (in [12]) and Deuflhard (in [8]) developed a number of schemes to tackle semilinear second-order differential equations. When damping terms appear in (6), however, a direct approach to compute the matrix exponential \({\mathrm {e}}^{tA}\) is more involved and has not yet been discussed in the literature. Therefore, in this paper, we first present an approach to evaluate the matrix exponential of (6) exactly.

Let us briefly explain our procedure to compute the matrix exponential. We start by employing two linear transformations to represent the matrix A as \(A = \widetilde{Q}PCP' \widetilde{Q}'\), where the new matrix C is block diagonal, i.e. \(C = {\mathrm{diag}}(G_1,\cdots ,G_n)\), and each block \(G_i\) is a \(2 \times 2\) matrix. The exponential of such a matrix \(G_i\) will be computed explicitly; depending on its eigenvalues, a suitable formula will be derived. In this way the matrix exponential \({\mathrm {e}}^{tA}\) can be computed cheaply even for large values of t. We also discuss the cases \(\beta = \gamma = \delta = 0\) and \(\beta , \gamma \ll \alpha \) (see [21, Section 3] for typical physical parameters). In both cases, the matrix \(G_i\) usually has two complex conjugate eigenvalues. To reduce the computational cost, we avoid complex arithmetic. The exact matrix exponential is not only a huge advantage for solving linear damped wave equations and linear beam equations, it is also valuable for computing solutions of semilinear problems. Numerical schemes for the full equation (18) are constructed by incorporating the exact solution of (7) in an appropriate way. In the literature, these methods were investigated by many authors (see, e.g., [7, 9, 17,18,19, 22, 24, 31]). To employ these known exponential integrators, the core task is the computation of the related matrix functions \(\varphi _k(tA)\). As for the matrix exponential, we will use two linear transformations and compute the action of the matrix functions \(\varphi _k(tG_i)\); explicit formulas will be established in the same way as for \({\mathrm {e}}^{tG_i}\). Concerning the computation of matrix functions, we refer to the review by Higham and Al-Mohy [16] as well as the monograph by Higham [15].

The outline of the paper is as follows. We start with the discussion of computing the matrix exponential \({\mathrm {e}}^{tA}\) in Sect. 2. Two linear transformations P and Q will be presented, and the computation of the matrix exponential \({\mathrm {e}}^{tG_i}\) will be discussed for three different cases. In simulations, instead of forming the matrix exponential, we will rather compute its action on a given vector; detailed instructions are given in Remark 2.7. In Sect. 3 we recall some exponential integrators and discuss an approach to compute the action of the related matrix functions \(\varphi _k(tA)\). The procedure is summarized in Sect. 3.3. In Sect. 4, we present some numerical examples of semilinear equations. The operators \((-\Delta )\) and \(\partial _{xxxx}^4\) are discretized by finite differences, and exponential integrators are used for the time integration of these examples. Some comparisons with standard integrators are presented in Sect. 4.3 to illustrate the efficiency of our approach.

2 Exact Matrix Exponential

In this section, we propose an approach to compute efficiently the matrix exponential \({\mathrm {e}}^{tA}\) for a matrix A of the form (6). With this at hand, the solution of linear system (7) can be evaluated for an arbitrary time \(t>0\) in a fast and reliable way.

2.1 Two Linear Transformations

The key idea is to transform A to a simple block-diagonal matrix for which the exponential can be computed cheaply.

Lemma 2.1

Assume that there exist an orthogonal matrix Q and a diagonal matrix \(D = {\mathrm{diag}}\{\lambda _1,\ldots , \lambda _n\}\) such that \(S = QDQ'\), then the matrix A of form (6) can be transformed to the block form

$$\begin{aligned} B = {\begin{bmatrix}0 &{} \quad I \\ D_1 &{} \quad D_2\end{bmatrix}}, \end{aligned}$$
(9)

where \(D_1\) and \(D_2\) are two diagonal matrices.

Proof

By substituting \(S = QDQ'\) and \(QQ' = I\) into (6), we get that

$$\begin{aligned} A&= {\begin{bmatrix} 0 &{} \quad QQ' \\ -\alpha QDQ' - \delta QQ' &{} \quad -\beta QDQ' - \gamma QQ'\end{bmatrix}} \\&= {\begin{bmatrix}Q &{} \quad 0 \\ 0 &{} \quad Q\end{bmatrix}} {\begin{bmatrix}0 &{} \quad I \\ -\alpha D - \delta I &{} \quad - \beta D - \gamma I \end{bmatrix}} {\begin{bmatrix}Q' &{} \quad 0 \\ 0 &{} \quad Q'\end{bmatrix}}. \end{aligned}$$

The proof is complete by identifying two diagonal matrices \(D_1 = -\alpha D - \delta I\) and \(D_2 = -\beta D - \gamma I\). \(\square \)

Lemma 2.2

Let \(P \in {\mathbb {R}}^{2n \times 2n}\) be the permutation matrix satisfying

$$\begin{aligned} P_{i,2i-1}&= 1, \quad P_{i+n,2i} = 1 \quad \text { for } 1 \le i \le n, \quad P_{k,l} = 0 \text { else}. \end{aligned}$$
(10)

The matrix B given in (9) can be transformed under the permutation P to a block diagonal matrix C, i.e. \(B = PCP'\), where

$$\begin{aligned} C = {\begin{bmatrix} G_1 &{} \quad 0 &{} \quad \cdots &{} \quad 0 \\ 0 &{} \quad G_2 &{} \quad \cdots &{} \quad 0 \\ \vdots &{} \quad \vdots &{} \quad \ddots &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad \cdots &{} \quad G_n \end{bmatrix}} \quad \text {with} \quad G_i = {\begin{bmatrix}0 &{} \quad 1 \\ -\alpha \lambda _i - \delta &{} \quad -\beta \lambda _i - \gamma \end{bmatrix}}. \end{aligned}$$

Proof

Following the definitions of the matrices B and C, for \(1 \le i \le n\) we have

$$\begin{aligned} B_{i, n+i}&=1,\quad {B_{n+i,i} = -\alpha \lambda _i-\delta },\quad B_{n+i, n+i} = -\beta \lambda _i - \gamma , \\ C_{2i-1, 2i}&=1,\quad {C_{2i, 2i-1} = -\alpha \lambda _i-\delta },\quad C_{2i, 2i} = -\beta \lambda _i - \gamma . \end{aligned}$$

We will prove that \(B = P C P'\). Indeed, for \(1 \le i \le n\), we have

$$\begin{aligned} (P C P')_{n+i, n+i}&= P_{n+i,2i} (C P')_{2i, n+i} = (C P')_{2i, n+i} = C_{2i,2i}P'_{2i,n+i} \\&= C_{2i,2i} P_{i+n,2i} = C_{2i,2i} = -\beta \lambda _i - \gamma = B_{n+i, n+i}, \\ (P C P')_{i, n+i}&= P_{i,2i-1} (C P')_{2i-1, n+i} = (CP')_{2i-1, n+i} = C_{2i-1,2i} P'_{2i,n+i} \\&= C_{2i-1,2i} = 1 = B_{i,n+i}, \\ (P C P')_{n+i, i}&= P_{n+i,2i} (CP')_{2i, i} = (C P')_{2i, i} = C_{2i,2i-1} P'_{2i-1,i} \\&= C_{2i,2i-1} = -\alpha \lambda _i - \delta = B_{n+i, i}. \end{aligned}$$

We will not be concerned with the remaining elements of B and C since they are all zero. Thus, the proof is complete. \(\square \)

Example 2.3

For \(n = 2\) and \(n = 3\) the permutation matrices P have the following form

$$\begin{aligned} P_2 = {\begin{bmatrix}1 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 \\ 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 1\end{bmatrix}}, \quad P_3 = {\begin{bmatrix}1 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 \\ 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 1\end{bmatrix}}. \end{aligned}$$
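For illustration, the permutation P never needs to be formed explicitly; a minimal NumPy sketch (the helper name `perm_index` is ours, not from the paper) builds the corresponding index vector from (10):

```python
import numpy as np

def perm_index(n):
    """0-based index vector p of the permutation P in (10): row k of P has its 1 in column p[k]."""
    p = np.empty(2 * n, dtype=int)
    p[:n] = 2 * np.arange(n)       # rows 1..n:    P_{i,2i-1} = 1
    p[n:] = 2 * np.arange(n) + 1   # rows n+1..2n: P_{i+n,2i} = 1
    return p

p = perm_index(3)   # array([0, 2, 4, 1, 3, 5]), i.e. [1, 3, 5, 2, 4, 6] in 1-based indexing
# Applying P to a vector v is then simply v[p]; applying P' is v[np.argsort(p)].
```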

Next, we recall some important properties of matrix functions (see [15, Theorem 1.13] or [16, Theorem 2.3]).

Theorem 2.4

Let \(A \in {\mathbb {C}}^{n \times n}\) and f be defined on the spectrum of A. Then

  1. (a)

    f(A) commutes with A;

  2. (b)

    \(f(A') = f(A)' \);

  3. (c)

    \(f(XAX^{-1}) = Xf(A) X^{-1}\).

  4. (d)

    The eigenvalues of f(A) are \(f(\lambda _i)\), where \(\lambda _i\) are the eigenvalues of A.

  5. (e)

    If \(A = (A_{ij})\) is block triangular then \(F=f(A)\) is block triangular with the same block structure as A, and \(F_{ii} = f(A_{ii})\).

  6. (f)

    If \(A = {\mathrm{diag}}(A_{11}, A_{22}, \dots , A_{mm})\) is block diagonal then

    $$\begin{aligned} f(A) = {\mathrm{diag}}(f(A_{11}), f(A_{22}), \dots , f(A_{mm})). \end{aligned}$$

A direct consequence of this theorem is the following result.

Theorem 2.5

Assume that there exist an orthogonal matrix Q and a diagonal matrix \(D = {\mathrm{diag}}\{\lambda _1, \cdots , \lambda _n\}\) such that \(S = Q D Q'\). Then, for \(t >0\), the exponential of the matrix tA is computed as follows

$$\begin{aligned} {\mathrm {e}}^{tA} = {\begin{bmatrix}Q &{} \quad 0 \\ 0 &{} \quad Q\end{bmatrix}} P {\begin{bmatrix} {\mathrm {e}}^{tG_1} &{} \quad 0 &{} \quad \cdots &{} \quad 0 \\ 0 &{} \quad {\mathrm {e}}^{tG_2} &{} \quad \cdots &{} \quad 0 \\ \vdots &{} \quad \vdots &{} \quad \ddots &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad \cdots &{} \quad {\mathrm {e}}^{tG_n} \end{bmatrix}} P' {\begin{bmatrix}Q' &{} \quad 0 \\ 0 &{} \quad Q'\end{bmatrix}}, \end{aligned}$$
(11)

where \(P \in {\mathbb {R}}^{2n \times 2n}\) is defined by (10) and \(G_i = {\begin{bmatrix}0 &{} \quad 1 \\ -\alpha \lambda _i - \delta &{} \quad -\beta \lambda _i - \gamma \end{bmatrix}}\).

Proof

The two Lemmas 2.1 and 2.2 imply that

$$\begin{aligned} A = {\begin{bmatrix}Q &{} \quad 0 \\ 0 &{} \quad Q\end{bmatrix}} B {\begin{bmatrix}Q' &{} \quad 0 \\ 0 &{} \quad Q'\end{bmatrix}} = {\begin{bmatrix}Q &{} \quad 0 \\ 0 &{} \quad Q\end{bmatrix}} P {\begin{bmatrix} {G_1} &{} \quad 0 &{} \quad \cdots &{} \quad 0 \\ 0 &{} \quad {G_2} &{} \quad \cdots &{} \quad 0 \\ \vdots &{} \quad \vdots &{} \quad \ddots &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad \cdots &{} \quad {G_n} \end{bmatrix}} P' {\begin{bmatrix}Q' &{} \quad 0 \\ 0 &{} \quad Q'\end{bmatrix}}. \end{aligned}$$

Formula (11) is proved by using the properties (c) and (f) in Theorem 2.4. \(\square \)

Remark 2.6

We need to compute the matrix exponential of the small matrices \(tG_i\). This will be presented in the next section. The exponential matrix \({\mathrm {e}}^{tG_i}\) can be computed explicitly by using formula (14), (15), or (16) depending on the sign of \((\beta \lambda _i + \gamma )^2-4(\alpha \lambda _i + \delta )\).

Remark 2.7

In practical situations, to reduce the computational cost, we will compute the action of the matrix exponential on a vector instead of forming the matrix explicitly. In (11), P and Q are two square matrices of orders 2n and n, respectively. Since P is a permutation matrix, it can be stored as an index vector with 2n entries indicating the positions of its non-zero elements, for example:

$$\begin{aligned} P_3 = {\begin{bmatrix}1 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 \\ 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 1\end{bmatrix}} \rightarrow {\begin{bmatrix}1 \\ 3 \\ 5 \\ 2 \\ 4 \\ 6\end{bmatrix}},\quad P'_3 = {\begin{bmatrix}1 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 1 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 1\end{bmatrix}} \rightarrow {\begin{bmatrix}1 \\ 4 \\ 2 \\ 5 \\ 3 \\ 6\end{bmatrix}} \end{aligned}$$

The block matrix \([{\mathrm {e}}^{tG_i}]\) can be stored as a \(2 \times 2n\) matrix. Given a compound vector \(v_0 = {\begin{bmatrix}a, b\end{bmatrix}}'\), where a and b are two column vectors with n entries, we start by evaluating the new vector \(v_1 = [Q'a, Q'b]'\). Next, the action of the permutation matrix \(P'\) on the vector \(v_1\) is a reordering of its entries, which gives a new vector \(v_2\). Then we multiply the block diagonal exponential matrix with \(v_2\) by cheaply applying each \(2 \times 2\) block \({\mathrm {e}}^{tG_i}\) to the two corresponding entries of \(v_2\), which yields \(v_3\). Applying P and Q analogously to \(v_3\), we obtain an exact evaluation of the action of the matrix exponential on an arbitrary vector.
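This procedure can be sketched in code as follows (a minimal NumPy illustration with our own naming; `blocks` is assumed to contain the n precomputed \(2 \times 2\) matrices \({\mathrm {e}}^{tG_i}\), or \(\varphi _k(tG_i)\) for the matrix functions of Sect. 3):

```python
import numpy as np

def apply_block_fun(Q, p, blocks, v0):
    """Evaluate blkdiag(Q,Q) P blkdiag(blocks) P' blkdiag(Q',Q') v0, cf. (11).
    Q: (n,n) orthogonal, p: index vector of P from (10), blocks: (n,2,2), v0: length 2n."""
    n = Q.shape[0]
    a, b = v0[:n], v0[n:]
    v1 = np.concatenate((Q.T @ a, Q.T @ b))                                   # blkdiag(Q',Q') v0
    v2 = v1[np.argsort(p)]                                                    # P' v1: reorder entries
    v3 = np.einsum('kij,kj->ki', blocks, v2.reshape(n, 2)).reshape(2 * n)     # 2x2 block products
    v4 = v3[p]                                                                # P v3
    return np.concatenate((Q @ v4[:n], Q @ v4[n:]))                           # blkdiag(Q,Q) v4
```

With the blocks \({\mathrm {e}}^{tG_i}\) this returns \({\mathrm {e}}^{tA}v_0\); the same routine can be reused verbatim for \(\varphi _k(tA)v_0\) in Sect. 3 by passing the blocks \(\varphi _k(tG_i)\).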

2.2 The Matrix Exponential \({\mathrm {e}}^{tG_i}\)

From (11), instead of evaluating the matrix exponential of \(A \in {\mathbb {R}}^{2n \times 2n}\), we only need to compute the matrix exponential of each \(G_i \in {\mathbb {R}}^{2 \times 2}\). In this section, we give explicit formulas. For simplicity, we omit the index i.

Theorem 2.8

Assume that f is an analytic function. For a \(2 \times 2\) matrix G, the matrix function f(G) can be computed explicitly as

$$\begin{aligned} f(G) = \frac{f(z_1) - f(z_2)}{z_1-z_2} G + \frac{z_1 f(z_2) - z_2 f(z_1)}{z_1-z_2} I, \end{aligned}$$
(12)

where \(z_1\) and \(z_2\) are the two distinct eigenvalues of the matrix G. In case the matrix G has a double eigenvalue \(z_1\), we get

$$\begin{aligned} f(G) = f'(z_1) G + \left( f(z_1) - f'(z_1) z_1 \right) I. \end{aligned}$$
(13)

Proof

Let p(z) be the characteristic polynomial of the matrix G and assume for a moment that the equation \(p(z) = 0\) has two distinct roots \(z_1\) and \(z_2\). The Cayley–Hamilton theorem states that \(p(G) = 0\).

The function f can be rewritten in the form \(f(z) = q(z) p(z) + r(z)\) where q(z) is some quotient and r(z) is a remainder polynomial with \(0 \le \deg r(z) <2\). From \(p(G) = 0\), we obtain

$$\begin{aligned} f(G) = r(G) = d_1 G + d_0 I. \end{aligned}$$

To complete the proof, we determine the coefficients \(d_1\) and \(d_0\). From \(f(z_1) = r(z_1)\) and \(f(z_2) = r(z_2)\), we obtain that

$$\begin{aligned} d_1 = \frac{f(z_1) - f(z_2)}{z_1 - z_2}, \quad {d_0} = \frac{z_1 f(z_2) - z_2 f(z_1)}{z_1-z_2}. \end{aligned}$$

In case of a double eigenvalue \(z_1\), we use the conditions \(f(z_1) = r(z_1)\) and \(f'(z_1) = r'(z_1)\). As a consequence, we obtain that \(d_1 = f'(z_1)\) and \(d_0 = f(z_1) - f'(z_1) z_1\). \(\square \)

We remark that similar formulas can be found in the work of Bernstein and So [3] or Cheng and Yau [6]. To reduce the computational cost, we try to avoid complex arithmetic.

Lemma 2.9

Assume that the matrix G is of the form

$$\begin{aligned} G = {\begin{bmatrix}0 &{} \quad 1 \\ -\alpha \lambda - \delta &{} \quad -\beta \lambda - \gamma \end{bmatrix}} \end{aligned}$$

and denote \(m = -\frac{1}{2} (\beta \lambda + \gamma )\).

  1. (i)

    If \((\beta \lambda + \gamma )^2 > 4 (\alpha \lambda + \delta )\), denoting \(n = \frac{1}{2} \sqrt{(\beta \lambda + \gamma )^2-4 (\alpha \lambda + \delta ) } \), the exponential matrix \({\mathrm {e}}^{tG}\) can be computed explicitly as follows

    $$\begin{aligned} {\mathrm {e}}^{tG} = \frac{{\mathrm {e}}^{t(m+n)}-{\mathrm {e}}^{t(m-n)}}{2n}{\begin{bmatrix} -m-n &{} \quad 1 \\ n^2 - m^2 &{} \quad m-n \end{bmatrix}} + {\mathrm {e}}^{t(m+n)} I. \end{aligned}$$
    (14)
  2. (ii)

    If \((\beta \lambda + \gamma )^2 = 4 (\alpha \lambda + \delta )\), we obtain that

    $$\begin{aligned} {\mathrm {e}}^{tG} = {\mathrm {e}}^{tm} {\begin{bmatrix}1-tm &{} \quad t \\ -tm^2 &{} \quad tm+1\end{bmatrix}}. \end{aligned}$$
    (15)
  3. (iii)

    If \((\beta \lambda + \gamma )^2 < 4 (\alpha \lambda + \delta )\), denoting \(n = \frac{1}{2} \sqrt{4 (\alpha \lambda + \delta ) - (\beta \lambda + \gamma )^2 } \), we get that

    $$\begin{aligned} {\mathrm {e}}^{tG} = \frac{{\mathrm {e}}^{tm}\sin (tn)}{n} {\begin{bmatrix}-m &{} \quad 1 \\ -n^2 - m^2 &{} \quad m\end{bmatrix}} + e^{tm} \cos (tn) I. \end{aligned}$$
    (16)

Proof

Let \(z_1\) and \(z_2\) be the two eigenvalues of the matrix tG. Thus \(z_1\) and \(z_2\) satisfy the characteristic equation \(z^2 + (\beta \lambda + \gamma ) t z + (\alpha \lambda + \delta ) t^2 = 0\). By using formula (12), we obtain that

$$\begin{aligned} {\mathrm {e}}^{tG} = \frac{{\mathrm {e}}^{z_1}-{\mathrm {e}}^{z_2}}{z_1 - z_2} tG + \frac{z_1 {\mathrm {e}}^{z_2} - z_2 {\mathrm {e}}^{z_1}}{z_1 - z_2} I. \end{aligned}$$
(17)

The discriminant of the characteristic equation is

$$\begin{aligned} D = \left( (\beta \lambda + \gamma )^2 - 4(\alpha \lambda + \delta ) \right) t^2. \end{aligned}$$

We consider three cases:

  1. (i)

    If \(D > 0\) or \((\beta \lambda + \gamma )^2 > 4(\alpha \lambda + \delta )\), the two real roots of the characteristic equation are \(z_1 = tm + tn\) and \(z_2 = tm - tn\), where \(m = -\frac{1}{2} (\beta \lambda + \gamma )\) and \(n = \frac{1}{2} \sqrt{(\beta \lambda + \gamma )^2 - 4 (\alpha \lambda +\delta )}\). From the definitions of the parameters m and n, we get that \(-\alpha \lambda - \delta = n^2-m^2\) and \(-\beta \lambda - \gamma = 2m\). We will simplify the two coefficients in formula (17):

    $$\begin{aligned} \frac{{\mathrm {e}}^{z_1}-{\mathrm {e}}^{z_2}}{z_1 - z_2}&= \frac{{\mathrm {e}}^{t(m+n)} - {\mathrm {e}}^{t(m-n)} }{2tn} , \\ \frac{z_1 {\mathrm {e}}^{z_2} - z_2 {\mathrm {e}}^{z_1}}{z_1 - z_2}&= \frac{(m+n){\mathrm {e}}^{t(m-n)}-(m-n) {\mathrm {e}}^{t(m+n)}}{2n} \\&= {\mathrm {e}}^{t(m+n)} - (m+n) \frac{{\mathrm {e}}^{t(m+n)} - {\mathrm {e}}^{t(m-n)} }{2n}. \end{aligned}$$

    By substituting the two simplified coefficients into (17), we get that

    $$\begin{aligned} {\mathrm {e}}^{tG}&= \frac{{\mathrm {e}}^{t(m+n)} - {\mathrm {e}}^{t(m-n)} }{2n} \left( G - (m+n) I \right) + {\mathrm {e}}^{t(m+n)} I \\&= \frac{{\mathrm {e}}^{t(m+n)} - {\mathrm {e}}^{t(m-n)} }{2n} \left( {\begin{bmatrix}0 &{} \quad 1 \\ n^2-m^2 &{} \quad 2m\end{bmatrix}} - {\begin{bmatrix}m+n &{} \quad 0 \\ 0 &{} \quad m+n\end{bmatrix}} \right) + {\mathrm {e}}^{t(m+n)} I \\&= \frac{{\mathrm {e}}^{t(m+n)} - {\mathrm {e}}^{t(m-n)} }{2n} {\begin{bmatrix}-m-n &{} \quad 1 \\ n^2-m^2 &{} \quad m-n \end{bmatrix}} + {\mathrm {e}}^{t(m+n)} I. \end{aligned}$$
  2. (ii)

    If \(D =0\) or \((\beta \lambda + \gamma )^2 = 4(\alpha \lambda + \delta )\), the characteristic equation has only one root \(z_1 = tm\), where \(m = -\frac{1}{2}(\beta \lambda + \gamma )\). In this case, we have

    $$\begin{aligned} {\mathrm {e}}^{tG}&= {\mathrm {e}}^{z_1} (tG) + {\mathrm {e}}^{z_1} (1-z_1) I \\&= {\mathrm {e}}^{tm} t {\begin{bmatrix}0 &{} \quad 1 \\ -m^2 &{} \quad 2m\end{bmatrix}} + {\mathrm {e}}^{tm} (1-tm){\begin{bmatrix}1 &{} \quad 0 \\ 0 &{} \quad 1\end{bmatrix}} = {\mathrm {e}}^{tm} {\begin{bmatrix}1-tm &{} \quad t \\ -tm^2 &{} \quad tm+1\end{bmatrix}}. \end{aligned}$$
  3. (iii)

    If \(D <0\) or \((\beta \lambda + \gamma )^2 < 4(\alpha \lambda + \delta )\), the characteristic equation has two conjugate complex roots \(z_1 = tm +{\mathrm {i}}tn\) and \(z_2 = tm - {\mathrm {i}}tn\), where

    $$\begin{aligned} m = -\frac{1}{2} (\beta \lambda + \gamma ), \quad n = \frac{1}{2} \sqrt{4 (\alpha \lambda + \delta ) - (\beta \lambda + \gamma )^2 }. \end{aligned}$$

    Since \(z_2 = \overline{z_1}\), we infer that \({\mathrm {e}}^{z_2} = \overline{{\mathrm {e}}^{z_1}}\) and \(z_2 {\mathrm {e}}^{z_1} = \overline{z_1} \overline{{\mathrm {e}}^{z_2}} = \overline{z_1{\mathrm {e}}^{z_2}}\). We analogously simplify the two coefficients in (17) as follows

    $$\begin{aligned} \frac{{\mathrm {e}}^{z_1}-{\mathrm {e}}^{z_2}}{z_1 - z_2}&= \frac{{\mathrm {e}}^{z_1} - \overline{{\mathrm {e}}^{z_1}}}{z_1 - \overline{z_1}} = \frac{2{\mathrm{Im}}({\mathrm {e}}^{z_1})}{2{\mathrm{Im}}(z_1)} = \frac{{\mathrm {e}}^{tm} \sin (tn) }{tn} , \\ \frac{z_1 {\mathrm {e}}^{z_2} - z_2 {\mathrm {e}}^{z_1}}{z_1 - z_2}&= \frac{{\mathrm{Im}}(z_1 \overline{{\mathrm {e}}^{z_1}})}{{\mathrm{Im}}(z_1)} = {\mathrm {e}}^{tm} \cos (tn) - m \frac{{\mathrm {e}}^{tm}\sin (tn)}{n} . \end{aligned}$$

    By substituting the two simplified coefficients into (17) and noting that \(-\alpha \lambda - \delta = -n^2-m^2\) and \(-\beta \lambda - \gamma = 2m\), we get that

    $$\begin{aligned} {\mathrm {e}}^{tG}&= \frac{{\mathrm {e}}^{tm}\sin (tn)}{n} (G- m I) + e^{tm} \cos (tn) I \\&= \frac{{\mathrm {e}}^{tm}\sin (tn)}{n} {\begin{bmatrix}-m &{} \quad 1 \\ -n^2 - m^2 &{} \quad m\end{bmatrix}} + e^{tm} \cos (tn) I. \end{aligned}$$

This concludes the proof. \(\square \)

Remark 2.10

Formula (16) is useful in computations. For example, for beam equations with the typical physical parameters proposed by Ito and Morris in [21, Section 3], the matrix G usually has two complex conjugate eigenvalues.
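For reference, the three cases of Lemma 2.9 translate directly into code; a minimal sketch (the function name `expm_G` is ours):

```python
import numpy as np

def expm_G(t, lam, alpha, beta, gamma, delta):
    """Explicit e^{tG} for G = [[0, 1], [-alpha*lam - delta, -beta*lam - gamma]],
    using (14), (15), or (16) depending on the sign of the discriminant."""
    m = -0.5 * (beta * lam + gamma)
    disc = (beta * lam + gamma) ** 2 - 4.0 * (alpha * lam + delta)
    I = np.eye(2)
    if disc > 0:                                   # two distinct real eigenvalues, (14)
        n = 0.5 * np.sqrt(disc)
        c = (np.exp(t * (m + n)) - np.exp(t * (m - n))) / (2.0 * n)
        return c * np.array([[-m - n, 1.0], [n**2 - m**2, m - n]]) + np.exp(t * (m + n)) * I
    if disc == 0:                                  # double eigenvalue, (15)
        return np.exp(t * m) * np.array([[1.0 - t * m, t], [-t * m**2, t * m + 1.0]])
    n = 0.5 * np.sqrt(-disc)                       # complex conjugate pair, (16)
    c = np.exp(t * m) * np.sin(t * n) / n
    return c * np.array([[-m, 1.0], [-n**2 - m**2, m]]) + np.exp(t * m) * np.cos(t * n) * I
```

The result can be checked against a general-purpose routine such as scipy.linalg.expm applied to \(tG\).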

3 Exponential Integrators

3.1 Exponential Integrators for Semilinear Problems

We consider semilinear differential equations of the form

$$\begin{aligned} \dot{y}(t) = A y(t) + F(y(t)). \end{aligned}$$
(18)

The solution of this equation at time \(t_{n+1} = t_n + \tau _n,~~~t_0 = 0, n \in {\mathbb {N}}\) is given by the variation-of-constants formula

$$\begin{aligned} y(t_{n+1}) = {\mathrm {e}}^{\tau _n A} y(t_n) + \int _{0}^{\tau _n} {\mathrm {e}}^{(\tau _n - \tau )A} F(y(t_n + \tau )) d\tau . \end{aligned}$$

For the numerical solution of (18), we recall a general class of one-step exponential integrators from [17,18,19]

$$\begin{aligned} y_{n+1}&= {\mathrm {e}}^{\tau _n A} y_n + \tau _n \sum _{i=1}^s b_i(\tau _n A) F_{ni}, \\ Y_{ni}&= {\mathrm {e}}^{c_i \tau _n A} y_n + \tau _n \sum _{j=1}^{i-1} a_{ij} (\tau _n A) F_{nj}, \\ F_{nj}&= F(Y_{nj}). \end{aligned}$$

The coefficients are, as usual, collected in a Butcher tableau

[Butcher tableau of the general exponential Runge–Kutta scheme]

The method coefficients \(a_{ij}\) and \(b_i\) are constructed from a family of functions \(\varphi _k\) evaluated at the matrix \((\tau _n A)\). We next recall this family \(\varphi _k\), which was introduced before in [17,18,19].

Corollary 3.1

Consider the entire functions

$$\begin{aligned} \varphi _k(z) = \int _0^1 {\mathrm {e}}^{(1-\theta )z} \frac{\theta ^{k-1}}{(k-1)!} d\theta , \quad k \ge 1, \quad \varphi _0(z) = {\mathrm {e}}^z. \end{aligned}$$

These functions satisfy the following properties:

  1. (i)

    \(\displaystyle \varphi _k(0) = \frac{1}{k!} \);

  2. (ii)

    they satisfy the recurrence relation

    $$\begin{aligned} \varphi _{k+1}(z) = \frac{\varphi _{k}(z) - \varphi _{k}(0)}{z}; \end{aligned}$$
  3. (iii)

    the Taylor expansion of the function \(\varphi _k\) is

    $$\begin{aligned} \varphi _k (z) = \sum _{n=0}^\infty \frac{z^n}{(n+k)!}. \end{aligned}$$

To simplify notation, we denote

$$\begin{aligned} \varphi _{k,j} = \varphi _k( c_j \tau _n A), \quad \varphi _k = \varphi _k(\tau _n A). \end{aligned}$$
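The scalar values \(\varphi _k(z)\) required below (e.g. \(\varphi _k^\pm \) in Lemma 3.6) can be generated with the recurrence of Corollary 3.1; a minimal sketch (our naming; the cutoff guarding against cancellation for small |z| is our choice, not from the paper):

```python
import numpy as np
from math import factorial

def phi_scalar(k, z):
    """phi_k(z) via the recurrence phi_{j+1}(z) = (phi_j(z) - 1/j!)/z, phi_0(z) = exp(z).
    For |z| near zero the Taylor series phi_k(z) = sum_n z^n/(n+k)! is the safer choice."""
    if abs(z) < 1e-8:
        return sum(z**n / factorial(n + k) for n in range(8))
    val = np.exp(z)
    for j in range(k):
        val = (val - 1.0 / factorial(j)) / z
    return val
```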

Next, we recall five exponential integrators that will be used in our numerical examples.

Example 3.2

For \(s=1\), the exponential Euler method has the form

$$\begin{aligned} y_{n+1} = {\mathrm {e}}^{\tau _n A} y_n + \tau _n \varphi _1 (\tau _n A) F(y_n). \end{aligned}$$
(19)
[Butcher tableau of the exponential Euler method]

We denote this method by EI-E1.
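As an illustration, one step of EI-E1 in the setting of Sect. 2 could be coded as follows (a sketch reusing `apply_block_fun` from Remark 2.7; `exp_blocks` and `phi1_blocks` are assumed to hold the precomputed blocks \({\mathrm {e}}^{\tau G_i}\) and \(\varphi _1(\tau G_i)\)):

```python
def exponential_euler_step(Q, p, exp_blocks, phi1_blocks, F, y, tau):
    """One step of (19): y_{n+1} = e^{tau A} y_n + tau * phi_1(tau A) F(y_n)."""
    return apply_block_fun(Q, p, exp_blocks, y) + tau * apply_block_fun(Q, p, phi1_blocks, F(y))
```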

Example 3.3

For \(s=2\), we recall a second-order method proposed by Strehmel and Weiner in [31, Section 4.5.3]:

[Butcher tableau of the second-order Strehmel–Weiner method]

A simplified version, where only \(\varphi _1\) is used, is also proposed by Strehmel and Weiner

[Butcher tableau of the simplified second-order Strehmel–Weiner method]

Example 3.4

For \(s=4\), we recall two schemes. The first one, proposed by Krogstad [22], is given by

[Butcher tableau of Krogstad's fourth-order method]

The second method is suggested by Strehmel and Weiner (see [31, Example 4.5.5])

[Butcher tableau of the fourth-order Strehmel–Weiner method]

3.2 Computing Matrix Functions of tA

To apply these exponential integrators to semilinear problems, we next introduce an approach to explicitly compute the matrix functions \(\varphi _k(tA)\). We first present an analogous version of Theorem 2.5.

Theorem 3.5

Assume that there exist an orthogonal matrix Q and a diagonal matrix \(D = {\mathrm{diag}}\{\lambda _1, \cdots , \lambda _n\}\) such that \(S = Q D Q'\). Then, for \(t >0\) and \(k \ge 1\), the functions \(\varphi _k(tA)\) are computed as follows

$$\begin{aligned} \varphi _k(tA) = {\begin{bmatrix}Q &{} \quad 0 \\ 0 &{} \quad Q\end{bmatrix}} P {\begin{bmatrix} \varphi _k(tG_1) &{} \quad 0 &{} \quad \cdots &{} \quad 0 \\ 0 &{} \quad \varphi _k(tG_2)&{} \quad \cdots &{} \quad 0 \\ \vdots &{} \quad \vdots &{} \quad \ddots &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad \cdots &{} \quad \varphi _k(tG_n) \end{bmatrix}} P' {\begin{bmatrix}Q' &{} \quad 0 \\ 0 &{} \quad Q'\end{bmatrix}}, \end{aligned}$$
(20)

where \(P \in {\mathbb {R}}^{2n \times 2n}\) is given in (10) and \(G_i = {\begin{bmatrix}0 &{} 1 \\ -\alpha \lambda _i - \delta &{} -\beta \lambda _i - \gamma \end{bmatrix}}\).

The matrix functions \(\varphi _k(tG_i)\) are computed explicitly. The actual formula depends on the sign of \((\beta \lambda _i + \gamma )^2-4(\alpha \lambda _i + \delta )\). Next, we will present two lemmas concerning these functions.

Lemma 3.6

Assume that the matrix G is of the form

$$\begin{aligned} G = {\begin{bmatrix}0 &{} \quad 1 \\ -\alpha \lambda - \delta &{} \quad -\beta \lambda - \gamma \end{bmatrix}} \end{aligned}$$

and denote \(m = -\frac{1}{2} (\beta \lambda + \gamma )\).

  1. (i)

    If \((\beta \lambda + \gamma )^2 > 4 (\alpha \lambda + \delta )\), denoting \(n = \frac{1}{2} \sqrt{(\beta \lambda + \gamma )^2-4 (\alpha \lambda + \delta ) } \), the matrix functions \(\varphi _k(tG)\) can be computed explicitly as follows

    $$\begin{aligned} \varphi _k(tG) = \frac{\varphi _k^+ -\varphi _k^-}{2n}{\begin{bmatrix} -m-n &{} \quad 1 \\ n^2 - m^2 &{} \quad m-n \end{bmatrix}} + \varphi _k^+ I, \end{aligned}$$
    (21)

    where \(\varphi _k^+ = \varphi _k(t(m+n))\) and \(\varphi _k^- = \varphi _k(t(m-n))\).

  2. (ii)

    If \((\beta \lambda + \gamma )^2 = 4 (\alpha \lambda + \delta )\), we obtain that

    $$\begin{aligned} \varphi _k(tG) = \varphi '_k(tm) {\begin{bmatrix}-tm &{} \quad t \\ -tm^2 &{} \quad tm\end{bmatrix}} + \varphi _k(tm) I, \end{aligned}$$
    (22)

    where the derivative \(\varphi ^\prime _k(z)\) can be computed recursively

    $$\begin{aligned} \varphi _0(z)&= \varphi _0^\prime (z) = {\mathrm {e}}^z, \\ \varphi ^\prime _{k+1}(z)&= \frac{\varphi _k'(z) - \varphi _{k+1}(z)}{z}, \quad \varphi _{k+1}(z) = \frac{\varphi _k(z) - \varphi _{k}(0)}{z}. \end{aligned}$$

Proof

By applying Theorem 2.8, the proof follows the lines of the first two points in the proof of Lemma 2.9. \(\square \)
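Both cases of Lemma 3.6 can be sketched in code, reusing `phi_scalar` from the sketch after Corollary 3.1 (again, the helper name is ours; the recurrence for \(\varphi '_k\) is only used away from \(z = 0\)):

```python
import numpy as np

def phi_k_G_real(k, t, lam, alpha, beta, gamma, delta):
    """phi_k(tG) via (21) (distinct real eigenvalues) or (22) (double eigenvalue)."""
    m = -0.5 * (beta * lam + gamma)
    disc = (beta * lam + gamma) ** 2 - 4.0 * (alpha * lam + delta)
    I = np.eye(2)
    if disc > 0:                                              # formula (21)
        n = 0.5 * np.sqrt(disc)
        pp, pm = phi_scalar(k, t * (m + n)), phi_scalar(k, t * (m - n))
        return (pp - pm) / (2.0 * n) * np.array([[-m - n, 1.0],
                                                 [n**2 - m**2, m - n]]) + pp * I
    # double eigenvalue, formula (22); phi'_k via phi'_{j+1}(z) = (phi'_j(z) - phi_{j+1}(z))/z
    z = t * m
    dval = np.exp(z)
    for j in range(k):
        dval = (dval - phi_scalar(j + 1, z)) / z
    return dval * np.array([[-t * m, t], [-t * m**2, t * m]]) + phi_scalar(k, z) * I
```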

The last lemma concentrates on the case of two complex eigenvalues of \(G_i\). Again the idea is to compute the matrix functions without explicitly using complex numbers. It is inspired by formula (16) above.

Lemma 3.7

In the case \((\beta \lambda + \gamma )^2 < 4 (\alpha \lambda + \delta )\), the matrix tG has two complex conjugate eigenvalues \(z_1\) and \(z_2\) with \(z_1 = tm + {\mathrm {i}}tn\), where

$$\begin{aligned} m = -\frac{1}{2} (\beta \lambda + \gamma ), \quad n = \frac{1}{2} \sqrt{4 (\alpha \lambda + \delta ) - (\beta \lambda + \gamma )^2 }. \end{aligned}$$

The matrix \(\varphi _k(tG)\) can be explicitly computed as follows

$$\begin{aligned} \varphi _k(tG) = \frac{i_k}{n} {\begin{bmatrix}-m &{} \quad 1 \\ -n^2 - m^2 &{} \quad m\end{bmatrix}} + r_k I. \end{aligned}$$
(23)

Here, the two coefficients \(i_k\) and \(r_k\) depend on \((t, m, n)\) and can be computed recursively as follows

$$\begin{aligned} i_0&={\mathrm {e}}^{tm} \sin (tn), \quad r_0 = {\mathrm {e}}^{tm} \cos (tn), \end{aligned}$$
(24a)
$$\begin{aligned} i_{k}&= \frac{1}{t(m^2 + n^2)} \left( m i_{k-1} - n \left( r_{k-1}- \frac{1}{(k-1)!} \right) \right) , \end{aligned}$$
(24b)
$$\begin{aligned} r_{k}&= \frac{1}{t(m^2 + n^2)} \left( n i_{k-1} + m \left( r_{k-1}- \frac{1}{(k-1)!} \right) \right) . \end{aligned}$$
(24c)

Proof

By using formula (12), we obtain that

$$\begin{aligned} \varphi _k(tG) = \frac{\varphi _k(z_1)-\varphi _k(z_2) }{z_1 - z_2} tG + \frac{z_1\varphi _k(z_2)-z_2\varphi _k(z_1) }{z_1 - z_2} I. \end{aligned}$$

First, we note that \(\varphi _{k}(z_2) = \overline{\varphi _{k}(z_1)}\) because \(\varphi _k\) has real coefficients. Thus we can simplify as follows

$$\begin{aligned} \frac{\varphi _k(z_1)-\varphi _k(z_2) }{z_1 - z_2} = \frac{{\mathrm{Im}}(\varphi _k(z_1))}{{\mathrm{Im}}z_1}, \quad \frac{z_1\varphi _k(z_2)-z_2\varphi _k(z_1) }{z_1 - z_2} = \frac{{\mathrm{Im}}(z_1\overline{\varphi _k(z_1)})}{{\mathrm{Im}}z_1}. \end{aligned}$$

Next, we rewrite the recursion as follows

$$\begin{aligned} \varphi _{k+1}(z_1) = \frac{\varphi _k(z_1)- \varphi _k(0)}{z_1} = \frac{(\varphi _k(z_1)- \varphi _k(0)) \overline{z_1}}{|z_1|^2}. \end{aligned}$$

To simplify notation, we denote \(i_{k} = {\mathrm{Im}}(\varphi _{k} (z_1))\) and \(r_{k} = {\mathrm{Re}}(\varphi _{k} (z_1))\). Thus we obtain that

$$\begin{aligned} i_{k+1}&= {\mathrm{Im}}(\varphi _{k+1}(z_1)) = \frac{1}{|z_1|^2} {\mathrm{Im}}\Big ((\varphi _k(z_1) - \varphi _k(0)) \overline{z_1} \Big ) \\&= \frac{1}{t^2(m^2+n^2)}\Big ( {\mathrm{Im}}(\varphi _k(z_1) - \varphi _k(0)) {\mathrm{Re}}(\overline{z_1}) + {\mathrm{Re}}(\varphi _k(z_1) - \varphi _k(0)) {\mathrm{Im}}(\overline{z_1}) \Big )\\&= \frac{1}{t(m^2+n^2)}\left( m i_k - n \left( r_k - \frac{1}{k!}\right) \right) , \\ r_{k+1}&= {\mathrm{Re}}(\varphi _{k+1}(z_1)) = \frac{1}{|z_1|^2} {\mathrm{Re}}\Big ((\varphi _k(z_1) - \varphi _k(0)) \overline{z_1} \Big ) \\&=\frac{1}{t^2(m^2+n^2)} \Big ( {\mathrm{Re}}(\varphi _k(z_1) - \varphi _k(0)) {\mathrm{Re}}(\overline{z_1}) - {\mathrm{Im}}(\varphi _k(z_1) - \varphi _k(0)) {\mathrm{Im}}(\overline{z_1}) \Big ) \\&= \frac{1}{t(m^2+n^2)} \left( n i_k + m \left( r_k - \frac{1}{k!} \right) \right) . \end{aligned}$$

Besides, we also get that \( {\mathrm{Im}}(z_1 \overline{\varphi _{k}(z_1)}) = tn r_k - tm i_k \). This finally yields that

$$\begin{aligned} \varphi _k(tG)&= \frac{i_k}{n} G + \frac{nr_k - mi_k}{n} I = \frac{i_k}{n} (G-m I) + r_k I \\&= \frac{i_k}{n} {\begin{bmatrix}-m &{} \quad 1 \\ -n^2 - m^2 &{} \quad m\end{bmatrix}} + r_k I, \end{aligned}$$

which completes the proof. \(\square \)
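The recursion (24) is equally straightforward to code; a minimal sketch with our naming (for \(k = 0\) the loop is empty and the formula reduces to (16)):

```python
import numpy as np
from math import factorial

def phi_k_G_complex(k, t, lam, alpha, beta, gamma, delta):
    """phi_k(tG) via (23) with the real recursion (24),
    valid when (beta*lam + gamma)^2 < 4*(alpha*lam + delta)."""
    m = -0.5 * (beta * lam + gamma)
    n = 0.5 * np.sqrt(4.0 * (alpha * lam + delta) - (beta * lam + gamma) ** 2)
    i_k, r_k = np.exp(t * m) * np.sin(t * n), np.exp(t * m) * np.cos(t * n)   # (24a)
    s = t * (m**2 + n**2)
    for j in range(1, k + 1):                                                 # (24b), (24c)
        c = r_k - 1.0 / factorial(j - 1)
        i_k, r_k = (m * i_k - n * c) / s, (n * i_k + m * c) / s
    return i_k / n * np.array([[-m, 1.0], [-n**2 - m**2, m]]) + r_k * np.eye(2)
```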

3.3 Summary of the Integration Procedure

The procedure described above can be summarized in two main parts. The Preparation part, which is done once at the beginning, consists of three steps:

  1. P1:

    Discretize the operator \((-\Delta )\) or \(\partial _{xxxx}^4\) as a square symmetric matrix S (e.g., by finite differences, see also Sect. 4).

  2. P2:

    Find an orthogonal matrix Q and a diagonal matrix \(D = {\mathrm{diag}}\{\lambda _1, \dots , \lambda _n \}\) such that \(S = Q D Q'\). The matrix D is stored as a vector.

  3. P3:

    Create a column vector which stores the positions of all non-zero entries of the permutation matrix P by using formula (10).

The Main part computes the action of the matrix functions on a given vector; this is required in every time step. Computing this action consists of two steps:

  1. M1:

    Compute the matrix functions \(\varphi _k(tG_i)\) by using formulas (14), (15), or (16) for \(\varphi _0(tG_i) = {\mathrm {e}}^{tG_i}\); formulas (21), (22), or (23) for \(\varphi _k(tG_i)\) with \(k \ge 1\).

  2. M2:

    Compute the action of the matrix functions \(\varphi _k(tA)\) using formula (11) for \(k = 0\) and formula (20) for \(k \ge 1\).
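In code, the Preparation part amounts to one symmetric eigendecomposition and one index vector; a minimal NumPy sketch (our naming, assuming the symmetric matrix S from step P1 is given):

```python
import numpy as np

def prepare(S):
    """Steps P2 and P3: S = Q diag(lam) Q' and the index vector of the permutation P in (10)."""
    lam, Q = np.linalg.eigh(S)                                    # P2 (done once)
    n = lam.size
    p = np.concatenate((2 * np.arange(n), 2 * np.arange(n) + 1))  # P3
    return lam, Q, p
```

The Main part then consists of forming the n blocks \(\varphi _k(tG_i)\) from the eigenvalues \(\lambda _i\) (step M1, Lemmas 2.9, 3.6, and 3.7) and calling `apply_block_fun` from Remark 2.7 (step M2).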

4 Numerical Examples

4.1 Semilinear Wave Equations

We consider a 1D semilinear wave equation on \(\Omega = (0,\ell )\)

$$\begin{aligned} \partial _{tt}^2u&- \alpha \partial _{xx}^2u - \beta \partial _{xxt}^3u + \delta u + \gamma \partial _tu = {g(u) + h(\partial _tu)}, \quad 0< x < \ell ,~~t \in (0,T], \end{aligned}$$
(25a)
$$\begin{aligned} u(t,0)&= 0, \quad u(t,\ell ) = 0, \end{aligned}$$
(25b)
$$\begin{aligned} u(0,x)&= p(x), \quad u_t(0,x) = q(x). \end{aligned}$$
(25c)

We consider the product space \(X = H^1_0(\Omega ) \times L^2(\Omega )\) and rewrite (25) in abstract form

$$\begin{aligned} \dot{y}(t) = {\mathcal {A}}y(t)+ {\mathcal {F}}(y(t)), \quad y(0)= (p, q)', \end{aligned}$$

where \({\mathcal {A}}: D({\mathcal {A}}) \rightarrow X\) is the operator defined by

$$\begin{aligned} {\mathcal {A}}{\begin{bmatrix}u \\ w\end{bmatrix}}&= {\begin{bmatrix}0 &{} \quad I \\ -\alpha (-\partial _{xx}^2) - \delta I &{} \quad - \beta (-\partial _{xx}^2)- \gamma I \end{bmatrix}}{\begin{bmatrix}u \\ w\end{bmatrix}} \\&= {\begin{bmatrix}w \\ -\alpha (-\partial _{xx}^2u) - \beta (-\partial _{xx}^2w) - \delta u- \gamma w \end{bmatrix}}, \\ D({\mathcal {A}})&= \left( H^2(\Omega ) \cap H_0^1(\Omega ) \right) ^2 , \\ {\mathcal {F}}{\begin{pmatrix}u \\ w\end{pmatrix}}&= {\begin{bmatrix}0 \\ {g(u) + h(w)}\end{bmatrix}}. \end{aligned}$$

Define the closed self-adjoint positive operator \({\mathcal {A}}_0\) on \(L^2(0,\ell )\) by

$$\begin{aligned} {\mathcal {A}}_0 \phi&= -\partial _{xx}^2\phi , \\ D({\mathcal {A}}_0)&= \{ \phi \in H^2(0,\ell ) \mid \phi (0) = \phi (\ell ) =0 \}. \end{aligned}$$

We use symmetric finite differences to discretize the operator \({\mathcal {A}}_0\). For this, the space interval \((0,\ell )\) is divided equidistantly by the nodes \(x_i = i \Delta x,~~i \in \{0, \dots , N+1 \}\), where N is a given integer and \(\Delta x = \frac{\ell }{N+1}\). Then, the discrete operator \({\mathcal {A}}_0\) is given by the matrix \(S_w \in {\mathbb {R}}^{N \times N}\) defined by

$$\begin{aligned} S_w = \frac{1}{\Delta x^2}{\begin{bmatrix} 2 &{} \quad -1 &{} \quad 0 &{} \quad \ldots &{} \quad 0 \\ -1 &{} \quad 2 &{} \quad -1 &{} \quad \ddots &{} \quad \vdots \\ 0 &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad 0 \\ \vdots &{} \quad \ddots &{} \quad -1 &{} \quad 2 &{} \quad -1 \\ 0 &{} \quad \ldots &{} \quad 0 &{} \quad -1 &{} \quad 2 \end{bmatrix}}. \end{aligned}$$
(26)

In the four examples below, we consider the space interval \(\Omega =(0,1)\) with \(N = 200\).
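For concreteness, the matrix \(S_w\) in (26) can be assembled as follows (a minimal sketch with our variable names, using \(\Delta x = \ell /(N+1)\) as above):

```python
import numpy as np

ell, N = 1.0, 200
dx = ell / (N + 1)
S_w = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / dx**2
```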

Example 4.1

Consider Eq. (25) with \(\alpha =\pi ^2,~\beta = 10^{-2},~\delta = 0,~\gamma = 10^{-2}\). The nonlinear source term is \(g(u) = \sin u\). This is a perturbed sine-Gordon equation of the form (2). The initial conditions are \(p(x) = 5 \sin (2\pi x)\) and \(q(x) = 0\). We use four different schemes, namely EI-E1, EI-SW21 (with \(c_2 = 0.75\)), EI-SW4, and EI-K4 to compute the solution at time \(T = 6\) with \(M \in \{5, 10, 20, \cdots , 5 \cdot 2^{11}\}\) time steps. The reference solution \(y_{ref} = (u_{ref}, (\partial _tu)_{ref})'\) plotted in Fig. 1a, b is computed by using EI-SW4 with \(M = 200000\) time steps. The discrete \(\ell _2\) error between the approximate solution obtained with the mentioned integrators at the final time \(y(T) = (u(T), \partial _tu(T))'\) and the reference solution \(y_{ref}\) is computed by the formula

$$\begin{aligned} \Vert y(T) - y_{ref} \Vert ^2_{\ell _2} = \Delta x \sum _{i=1}^N |y_i(T) -y_{ref,i}|^2. \end{aligned}$$
(27)

These errors are plotted in Fig. 1c. The expected convergence rate is observed for each scheme. Even when we use a rather coarse time mesh with \(M = 5\) and \(\Delta t= 1.2\), the error is quite small (approximately \(10^{-1}\)).

Fig. 1 Example 4.1

Example 4.2

Consider Eq. (25) with \(\alpha = 100,~\beta = 10^{-2},~\delta = 0,~\gamma = 10^{-3}\). The nonlinear source term is \(g(u) = u|u|\). The initial conditions are

$$\begin{aligned} p(x) = {\left\{ \begin{array}{ll} 2x \quad &{}\text {if} \quad x \le \frac{1}{2}, \\ -2x+2 \quad &{}\text {if} \quad x > \frac{1}{2}, \end{array}\right. } \quad \quad q (x) = \pi ^2 \sin (\pi x). \end{aligned}$$

We use five different schemes, namely EI-E1, EI-SW21, EI-SW22 (both schemes with \(c_2 = 0.2\)), EI-SW4, and EI-K4 to compute the solution at time \(T = 15\) with \(M \in \{20, 40, 80, \cdots , 20 \cdot 2^{12}\} \) time steps. The reference solution plotted in Fig. 2a, b is computed by using EI-K4 with \(M = 200000\). The errors are plotted in Fig. 2c.

Fig. 2 Example 4.2

Example 4.3

Consider Eq. (25) with \(\alpha = 15,~\beta = 10^{-3},~ \delta = 1,~\gamma = 10^{-6}\). The nonlinear source term is \(g(u) = u^3\). The initial conditions are \(p(x) = 10 \sin (3\pi x),~q (x) = -10 \cos (3\pi x) \). We use the four schemes EI-E1, EI-SW22 (with \(c_2 = 0.9\)), EI-K4, and EI-SW4 to compute the solution at time \(T = 30\) with \(M \in \{20, 40, 80, \cdots , 20 \cdot 2^{11}\}\) time steps. We plot the reference solution computed with EI-SW4 and \(M = 300000\) in Fig. 3a, b. The errors are plotted in Fig. 3c. The expected convergence rates are again observed, except that an order reduction to order 2 occurs for EI-K4, while EI-SW4 still preserves its convergence rate.

Fig. 3 Example 4.3

Example 4.4

This example concerns discontinuous initial conditions

$$\begin{aligned} p(x) = {\left\{ \begin{array}{ll} -1\quad &{}\text {if} \quad x \le \frac{1}{2}, \\ 5\quad &{}\text {if} \quad x > \frac{1}{2}, \end{array}\right. } \quad \quad q(x) = 0. \end{aligned}$$

The other parameters are \(\alpha = 5,~~\beta = 10^{-3},~\delta = 1,~\gamma = 10^{-4}\). The nonlinear term is \(g(u) = |u|\). The approximate solutions at \(T = 3\) are computed by using five exponential integrators, namely EI-E1, EI-SW21, EI-SW22 (both schemes with \(c_2 = 0.5\)), EI-SW4, and EI-K4 with \(M \in \{20, 40, \dots , 20\cdot 2^{13}\}\) time steps. The reference solution computed with \(M = 300000\) time steps by using EI-SW4 is plotted in Fig. 4a, b. The errors are plotted in Fig. 4c. We observe an order reduction to order 2 for the two fourth-order exponential integrators EI-SW4 and EI-K4 while the other integrators preserve their convergence rate.

Fig. 4 Example 4.4

Example 4.5

The last example concerns two nonlinear terms, namely \(g(u) = -u |u|^3\), \(h(w) = -w|w|\). The other parameters are \(\alpha = 50,~\beta = 10^{-6},~\delta = 10,~\gamma = 10^{-3}\). The initial conditions are \(p(x) = 20 \sin (4 \pi x)\) and \(q(x) = -25 \cos (3\pi x) \). The approximate solutions at \(T=1\) are computed by using five exponential integrators, namely EI-E1, EI-SW21, EI-SW22 (both schemes with \(c_2 = 0.85\)), EI-SW4, and EI-K4 with \(M \in \{160, 320, \dots , 160 \cdot 2^{10}\}\) time steps. The reference solution is computed by EI-SW4 with \(M = 800000\) and plotted in Fig. 5a, b. The convergence rates are plotted in Fig. 5c. The two fourth-order exponential integrators EI-SW4 and EI-K4 show order reductions: EI-SW4 still works well with a reduction to order 3, whereas EI-K4 performs poorly.

4.2 Railway Track Model

Assume that a track beam is made of Kelvin-Voigt material. The resulting railway track model is a semilinear PDE on \(\Omega = (0,L)\):

$$\begin{aligned} \partial _{tt}^2u&+ \partial _{xx}^2(\alpha \partial _{xx}^2u + \beta \partial _{xx}^2(\partial _tu) ) + \gamma \partial _tu + \delta u = -lu^3, \quad x \in \Omega , t \in (0,T], \end{aligned}$$
(28a)
$$\begin{aligned} u(t,0)&= u(t,L) = 0, \end{aligned}$$
(28b)
$$\begin{aligned} \alpha \partial _{xx}^2u (t,0)&+ \beta \partial _{xx}^2(\partial _tu) (t,0) = \alpha \partial _{xx}^2u (t,L) + \beta \partial _{xx}^2(\partial _tu) (t,L) = 0, \end{aligned}$$
(28c)
$$\begin{aligned} u(0,x)&= p(x), \quad \partial _tu (0,x) = q(x). \end{aligned}$$
(28d)

Denote the closed self-adjoint positive operator \({\mathcal {A}}_0\) on \(L^2(0,L)\) as

$$\begin{aligned} {\mathcal {A}}_0 \phi&{:}{=}\partial _{xxxx}^4\phi , \\ D({\mathcal {A}}_0)&{:}{=}\{\phi \in H^4(\Omega ) \mid \phi (0) = \phi (L) = 0,~~ \phi ''(0) = \phi ''(L) = 0 \}. \end{aligned}$$

Concerning the analysis of the linear operators, we refer to [2, 26]. We use finite differences to discretize the operator \({\mathcal {A}}_0\) on an equidistant space mesh \(x_i = i \Delta x,~~ i \in \{0,\dots ,N+1\} \), where N is a given integer and \(\Delta x = \frac{L}{N + 1}\). Then the discrete operator \({\mathcal {A}}_0\) is given by the matrix \(S_b \in {\mathbb {R}}^{N \times N}\), defined as follows

$$\begin{aligned} S_{b} = \frac{1}{\Delta x^{4}} \left[ \begin{array}{cccccccccccc} 5 &{} \quad -4 &{} \quad 1 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad \dots &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ -4 &{} \quad 6 &{} \quad -4 &{} \quad 1 &{} \quad 0 &{} \quad 0 &{} \quad \dots &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 1 &{} \quad -4 &{} \quad 6 &{} \quad -4 &{} \quad 1 &{} \quad 0 &{} \quad \dots &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ 0 &{} \quad 1 &{} \quad -4 &{} \quad 6 &{} \quad -4 &{} \quad 1 &{} \quad \dots &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0 \\ \vdots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \vdots \\ \vdots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \vdots \\ \vdots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \vdots \\ \vdots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \vdots \\ \vdots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \vdots \\ 0 &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \ddots &{} \quad 1 &{} \quad -4 &{} \quad 6 &{} \quad -4 &{} \quad 1 \\ 0&{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \ddots &{} \quad 0 &{} \quad 1 &{} \quad -4 &{} \quad 6 &{} \quad -4 \\ 0&{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad \dots &{} \quad 0 &{} \quad 1 &{} \quad -4 &{} \quad 5 \\ \end{array}.\right] \end{aligned}$$
(29)
Fig. 5 Example 4.5
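Analogously, the matrix \(S_b\) in (29) can be assembled as follows (a minimal sketch with our variable names; the corner entries 5 result from eliminating the ghost values with the hinged conditions (28b), (28c)):

```python
import numpy as np

L, N = 1.0, 300
dx = L / (N + 1)
S_b = (6.0 * np.eye(N)
       - 4.0 * (np.eye(N, k=1) + np.eye(N, k=-1))
       + np.eye(N, k=2) + np.eye(N, k=-2)) / dx**4
S_b[0, 0] = S_b[-1, -1] = 5.0 / dx**4   # hinged ends
```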

Example 4.6

Consider Eq. (28) with \(\alpha = 15,~\beta = 3 \cdot 10^{-6},~\delta = 10,~\gamma = 3\cdot 10^{-4}\). The nonlinear term is \(g(u) = -5u^3\). The initial conditions are

$$\begin{aligned} p(x) = 5 {\mathrm {e}}^{-100\left( x-\frac{2}{3}\right) ^2},\quad q(x) = 0. \end{aligned}$$

For our numerical solution, the space interval \(\Omega = (0,1)\) is divided into 300 equidistant subintervals. We compute approximate solutions at \(T = 5\) with four exponential integrators EI-E1, EI-SW22 (\(c_2 = 0.9\)), EI-SW4, and EI-K4 with \(M \in \{160,320,\dots , 160 \cdot 2^{10}\}\) time steps. We compare these numerical results with the reference solution evaluated by EI-K4 with \(M = 600000\) time steps. The reference solution is plotted in Fig. 6a, b. Notice that the magnitude of the velocity \(\partial _tu\) is extremely large. The errors are plotted in Fig. 6c. The four exponential integrators preserve their convergence rates. Since the matrix \(S_b\) is stiffer than \(S_w\), the computation for beam equations is more expensive than for wave equations. Nevertheless, solving beam equations with exponential integrators remains a good option; some comparisons in the next section will elucidate this point.

Fig. 6 Example 4.6

4.3 Comparisons with Standard Integrators

Some comparisons between our approach and standard integrators will be presented to demonstrate its efficiency. In particular, we consider the explicit Runge-Kutta method ode45 from MATLAB. Note that ode45 needs sufficiently small time steps to guarantee stability. The CFL condition depends on the setup of our model; for example, the larger the parameter \(\alpha \) in the example, the smaller the time step \(\Delta t\) has to be chosen. Besides, the maximum step size depends on the type of equation, i.e. the beam equation is stiffer than the wave equation. In particular, the relation between \(\Delta t\) and \(\Delta x\) is of the form \(\Delta t \sim (\Delta x)^2\) for beam equations and \(\Delta t \sim \Delta x\) for wave equations.

Stiff problems are often solved with implicit schemes. Therefore, we will make another comparison with a class of implicit Runge-Kutta methods, namely the Radau IIA methods (see [14, Section IV-5]). Though these methods do not require any CFL condition to guarantee stability, their computational cost is high since they require the solution of linear systems with large matrices. Below, we illustrate by some examples that both explicit and implicit Runge-Kutta methods are more expensive than our exponential integrators in the present context.

Example 4.7

Exact matrix exponential versus ode45 and Radau scheme for a linear example. Consider a linearized version of Eq. (25) with \(\alpha =100,~\beta = 10^{-2},~\delta = 10^{-2} ,~\gamma = 10^{-6}\). The initial conditions are \(p(x) = 5 \sin (2\pi x)\) and \(q(x) = 0\). We consider the space interval \(\Omega = (0,1)\) with \(N = 200\) (number of grid points).

We compute the solution at time \(T = 10\) by using different methods. Since the problem is linear, any exponential integrator yields the solution in a single step via the formula

$$\begin{aligned} y(T) = {\begin{bmatrix}u(T) \\ w(T)\end{bmatrix}} = {\mathrm {e}}^{TA} {\begin{bmatrix}p \\ q\end{bmatrix}}, \quad A = {\begin{bmatrix}0 &{} I \\ -\alpha S_w - \delta I &{} - \beta S_w - \gamma I\end{bmatrix}}, \end{aligned}$$

where \(S_w\) was defined in (26). The computational time for using this approach is 0.036s.

For comparison, we use an explicit Runge-Kutta method, namely ode45 from MATLAB, to obtain the solution at the final time \(T =10\) with various tolerances. For stability reasons, ode45 needs a huge number of time steps. The minimum number of time steps M which ode45 needs to achieve the corresponding accuracy is presented in the second column of Table 1; the corresponding computational time is reported in the third column. As an implicit method, we use the Radau IIA scheme. The minimum number of time steps which Radau needs to obtain the corresponding accuracy is shown in the fourth column of Table 1; the corresponding computational time is reported in the fifth column. The implicit scheme is more efficient than the explicit one for solving this linear equation. Obviously, for the linear case, both ode45 and Radau are expensive choices.

Table 1 Number of time steps M that ode45 and Radau need to compute the solution at \(T = 10\) and the corresponding computational time

Example 4.8

EI-K4 versus ode45 and Radau for a semilinear wave equation. Consider Eq. (25) with \(\alpha = 100,~\beta = \gamma = 10^{-3},~\delta = 10\). The nonlinear source term is \(g(u) = u^2\). The initial conditions are

$$\begin{aligned} p(x) = {\left\{ \begin{array}{ll} 2x \quad &{}\text {if} \quad x \le \frac{1}{2}, \\ -2x+2 \quad &{}\text {if} \quad x > \frac{1}{2}, \end{array}\right. } \quad \quad q (x) = \pi ^2 \sin (\pi x). \end{aligned}$$

We compute the solution at the final time \(T = 15\) by EI-K4, ode45, and Radau. To solve semilinear equations, all three integrators need a sufficient number of substeps to attain the solution at the final time. For the desired accuracies, the number of steps as well as the required computational time of the three schemes are reported in Table 2. For all accuracies, EI-K4 is more efficient than the two other schemes. Even though Radau needs fewer time steps to achieve the desired accuracy, its computational cost is very high since it has to solve a linear system involving a large matrix at each time step.

Table 2 Comparison among EI-K4, ode45, and Radau for a semilinear wave equation

Example 4.9

EI-K4 versus ode45 and Radau for a semilinear beam equation. We repeat Example 4.6 with a smaller matrix \(S_b\), i.e., the space interval \(\Omega = (0,1)\) is divided into 200 equidistant subintervals. The solution at time \(T=1\) is computed with EI-K4, ode45, and Radau. The number of time steps and the corresponding computational time are presented in Table 3 for some desired tolerances. As we mentioned at the beginning of this section, this problem is stiffer than the wave equation. To tackle this challenging situation, ode45 needs more than 1 million time steps to obtain the solution even for a low tolerance such as \(10^{-2}\). On the other hand, the implicit Radau scheme requires fewer time steps to obtain a desired accuracy; however, its computational cost is very high.

Table 3 Comparison among EI-K4, ode45, and Radau for a semilinear beam equation

Example 4.10

Adding all damping terms into the nonlinear part. Another common approach for solving (25) consists in merging the damping terms with the nonlinear terms and then employing various exponential integrators to solve the resulting semilinear equation. In this case, we have to solve the system

$$\begin{aligned} \dot{y}(t) = \widetilde{A} y(t) + \widetilde{F}(y(t)), \end{aligned}$$

where

$$\begin{aligned} \widetilde{A} = {\begin{bmatrix} 0 &{} \quad I \\ -\alpha S - \delta I &{} \quad 0 \end{bmatrix}}, \quad \widetilde{F}(y(t)) = \widetilde{F} {\begin{pmatrix}u \\ w\end{pmatrix}} = {\begin{bmatrix}0 \\ g(u) + h(w) - \beta Sw - \gamma w \end{bmatrix}}. \end{aligned}$$

The matrix S is either \(S = S_w\) as in (26) or \(S = S_b\) as in (29). The exponential function of \(\widetilde{A}\) can be computed by the following formula

$$\begin{aligned} {\mathrm {e}}^{t\widetilde{A}} = {\begin{bmatrix}\cos (t\Omega ) &{} \quad \Omega ^{-1} \sin (t\Omega ) \\ -\Omega \sin (t\Omega ) &{} \quad \cos (t\Omega )\end{bmatrix}}, \quad \Omega = \sqrt{\alpha S + \delta I}, \end{aligned}$$
(30)

An explicit form of \(\varphi _k(t\widetilde{A})\) was also presented in the literature (see, e.g., [32]). The two examples below illustrate that this approach is more expensive and also less accurate.

Consider the wave equation (25) with \(\alpha =1,~\beta = 10^{-2},~\delta = 1,~\gamma = 10^{-1}\). The nonlinear term is \(g(u) = -5u^3\). The initial conditions are \(p(x) = 5\sin (5\pi x)\) and \(q(x) = 5\cos (10 \pi x)\). We compute the solution at \(T=1\) by using EI-SW21 (\(c_2=\frac{1}{3}\)) with \(M \in \{10,20,\dots , 10 \cdot 2^{10}\}\) time steps. We compare these numerical results with the reference solution evaluated by EI-K4 with \(M = 100000\) time steps. The convergence rates of the two approaches are plotted in Fig. 7a. When all damping terms are added to the nonlinear part, the corresponding approximation is worse than the one given by our approach; the errors are reduced by a factor of approximately 100 with our approach. Moreover, even with a larger \(\Delta t\) we obtain the solution with acceptable accuracy, while the traditional approach needs much smaller time steps to remain stable.

We next repeat Example 4.6. The solution at the final time \(T=1\) is computed by using EI-SW21 (\(c_2=0.2\)) with \(M \in \{320,\dots , 320 \cdot 2^{7}\}\) time steps. These numerical solutions are compared with the reference solution obtained by using EI-SW4 with \(M = 100000\) time steps. The errors are plotted in Fig. 7b. While our approach works and preserves the convergence rate of the exponential integrator EI-SW21, the traditional approach fails even with a small \(\Delta t\). Since the matrix \(S_b\) is stiffer than \(S_w\), stability problems arise when the structural damping term \(\partial _{xxxxt} u\) is added to the nonlinear part.

In conclusion, the two examples clearly demonstrate the importance of using the matrix exponential of the linearization.

Fig. 7 The importance of using the matrix exponential of the linearization

5 Conclusion

We presented an approach to cheaply compute the action of the matrix exponential \({\mathrm {e}}^{tA}\) as well as the action of the matrix functions \(\varphi _k(tA)\) on a given vector by employing two linear transformations. Thus, the solution of certain linear differential equations can be computed in a fast and efficient way. By combining this with exponential integrators from the literature, we can also solve semilinear wave and beam equations efficiently.

Note that the described procedure can be extended to the case

$$\begin{aligned} A = \begin{bmatrix} A_1 &{} A_2 \\ A_3 &{} A_4 \end{bmatrix}, \quad \text {where } [A_i, A_j] = A_i A_j - A_j A_i = 0 \quad \text {for } 1 \le i,j \le 4,~i \ne j. \end{aligned}$$

Indeed, under the above assumption and provided that the \(A_i\) are diagonalizable, the four matrices \(A_i\) share a common basis of eigenvectors. Thus, there exist a matrix Q and four corresponding diagonal matrices \(D_i\) such that \(A_i = Q D_i Q'\) for \(1 \le i \le 4\). This implies that

$$\begin{aligned} A = \begin{bmatrix} Q &{} \quad 0 \\ 0 &{} \quad Q \end{bmatrix} \begin{bmatrix} D_1 &{} \quad D_2 \\ D_3 &{} \quad D_4 \end{bmatrix} \begin{bmatrix} Q' &{} \quad 0 \\ 0 &{} \quad Q' \end{bmatrix} \end{aligned}$$

Thus, the scheme can be applied analogously by evaluating the exponential of each \(2 \times 2\) block matrix \({\begin{bmatrix}\lambda _{1,i} &{} \lambda _{2,i} \\ \lambda _{3,i} &{} \lambda _{4,i}\end{bmatrix}}\), where \(\lambda _{k,i}\) denotes the i-th diagonal entry of \(D_k\) for \(1 \le k \le 4\).