1 Introduction

Many scientific and engineering problems require numerical methods that couple different wave phenomena. Numerous applications can be found within computational electrodynamics, computational geophysics, and computational fluid dynamics. Typical examples include the Euler equations coupled to the elastic wave equation, and electromagnetic wave scattering across material interfaces. A difficulty that can arise when developing numerical methods for such coupled problems is that the resulting discretization may suffer from stiffness. In that case, an explicit time stepping procedure may require a significantly smaller time step than expected from each subproblem in isolation. For example, within fluid-structure interaction, the impact of the coupling procedure is a well-known problem. The so-called added mass instability arises when light fluids are coupled to thin, dense structures immersed in the fluid. This problem has led to a plethora of different coupling strategies, e.g., see [2, 6, 32]. Potential stiffness issues related to large density contrasts also appear in linear settings when coupling acoustic and elastic wave equations. Stiffness can also appear in other settings, e.g., for frictional sliding interfaces [17] and narrow cracks [24].

The imposition of coupling conditions is closely related to the imposition of boundary conditions, where stiffness can occur for similar reasons. In this paper, we will, for ease of presentation, switch between these two use cases. We focus on weakly imposed boundary and coupling conditions, enforced by including additional source terms, so-called penalty terms, in the governing equations. The role of the penalty terms is to force the numerical solution on the interface towards satisfying the actual coupling conditions [4, 30]. A penalty term can be interpreted as a specific weighting of the coupling conditions in combination with the equations.

There is some flexibility in choosing penalty weights, and the choice can have a striking impact on a scheme’s performance. Well-designed penalty terms can be the difference between a scheme that works in practice and a suboptimal scheme that runs orders of magnitude slower. Unfortunately, it is quite easy to construct suboptimal penalty terms that cause these problems, even though they are semi-discretely stable. Coupling procedures that overcome these problems use either upwind numerical fluxes based on solving the Riemann problem [34], or a specific characteristic treatment [17, 20, 21]. In each of the above cases, the penalty terms introduce some amount of potentially unwanted additional dissipation.

Adding certain types of artificial dissipation may result in improved accuracy [16, 21]. However, dissipation may be suboptimal for wave propagation problems since it may prevent the use of symplectic or staggered time-stepping methods that have improved accuracy and stability properties [33]. For instance, a benefit of the fourth order staggered Runge–Kutta time-stepping scheme introduced in [11] is that its truncation error is a factor of 16 smaller and its stability region along the imaginary axis is a factor of two larger compared to the classical fourth order Runge–Kutta scheme. Unfortunately, the staggered Runge–Kutta scheme is unstable even for a small amount of artificial dissipation [11, 33]. An energy conservative weak coupling procedure for coupling acoustic materials across nonconforming grids is presented in [8]. However, this coupling procedure causes stiffness for problems with large density contrasts. To the best of our knowledge, there are no known penalty procedures that are both energy conservative and non-stiff. In this paper, we present a systematic procedure for constructing both energy conservative and dissipative penalty terms that are provably non-stiff.

We argue that numerical schemes should be provably stable and non-stiff. A powerful method for proving semi-discrete stability of a scheme is the energy method [13]. This method gives sufficient conditions for semi-discrete stability, but not necessary ones. However, when the energy method fails, it is usually an indication that there is an underlying instability present that may be triggered under certain circumstances. Without a proof of stability, there are no guarantees that the output produced by a numerical scheme is reliable. A proof of stability also helps to rule out bugs and can serve as a strict test of the implementation. For instance, one can randomly initialize all solution fields and model parameters, numerically compute the discrete energy rate in the implementation, and assert that it is non-positive.
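This implementation test can be sketched as follows for a hypothetical 1D advection model problem \(u_t + u_x = 0\); the second-order SBP operator and the SAT weight \(\tau = -1\) are our illustrative choices, not a prescription from this paper:

```python
import numpy as np

# Sketch of the energy-rate implementation test described above, for a
# 1D advection model problem u_t + u_x = 0 on [0, 1], discretized with a
# second-order diagonal-norm SBP operator and an SAT penalty (tau = -1)
# at the inflow boundary.
n, h = 50, 1.0 / 49
H = h * np.diag([0.5] + [1.0] * (n - 2) + [0.5])      # SBP norm matrix
Q = 0.5 * (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1))
Q[0, 0], Q[-1, -1] = -0.5, 0.5                        # Q + Q^T = diag(-1, 0, ..., 0, 1)
D = np.linalg.solve(H, Q)                             # D = H^{-1} Q
e0 = np.zeros(n); e0[0] = 1.0                         # restriction to the inflow node
tau = -1.0
M = -D + tau * np.linalg.solve(H, np.outer(e0, e0))   # du/dt = M u (zero boundary data)

# Discrete energy rate: d/dt (u^T H u) = u^T (H M + M^T H) u must be non-positive.
S = H @ M + M.T @ H
rng = np.random.default_rng(0)
for _ in range(100):
    u = rng.standard_normal(n)                        # randomly initialized field
    assert u @ S @ u <= 1e-12
```

For this particular closure, \(S\) reduces to a matrix that only extracts (negative) boundary contributions, so the random-field test passes for every draw.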

Similar arguments can be made for having a provably non-stiff scheme. A proof of non-stiffness gives a sufficient condition for preventing stiffness caused by sensitivity to model parameters. We will show that if the matrix norm of each penalty matrix can be bounded by the CFL condition, the scheme is provably non-stiff. Since this is only a sufficient condition, a lack of proof does not imply that the scheme is stiff. However, failure to bound the matrix norm could be an indication that the numerical scheme is sensitive to certain parameter values. A proof of non-stiffness also helps to rule out bugs in the implementation. For instance, if the simulation of a provably stable and non-stiff scheme becomes unstable due to a change in a model parameter independent of the CFL condition, the instability must be caused by an incorrect implementation. In addition, since the proof of non-stiffness constrains the parameters to some degree, this analysis can limit the potential parameter space that one must probe to select optimal parameter values.

The key developments in this work are applicable to any numerical scheme in Summation-By-Parts (SBP) form that weakly imposes boundary or coupling conditions via Simultaneous-Approximation-Terms (SAT). Numerical schemes in SBP-SAT form admit a proof of stability via the energy method, and the framework applies to finite difference [5, 19, 28], finite volume [22], spectral collocation [3], discontinuous Galerkin [9], and flux reconstruction schemes [26]. See [7, 30] for reviews. When developing SBP-SAT schemes, one can focus on formulating and analyzing the continuous problem. Once this analysis is complete, one can discretize the continuous formulation and obtain a provably stable scheme without much difficulty, as shown in [20]. This aspect allows us to construct penalty terms in the continuous setting, leaving out specific implementation details for the variety of numerical schemes that belong to the SBP-SAT family. Hence, the focus of this paper is on constructing and analyzing penalty terms in a continuous setting without placing emphasis on particular discretization or implementation details.

This paper is organized as follows. We begin in Sect. 2 by discussing a motivating example: the coupling of wave equations. This example demonstrates how naively selected penalty weights cause stiffness. Section 3 presents the general theory for constructing non-stiff penalty terms. In particular, it is shown that the penalty terms are connected to a projection matrix. Section 4 shows that penalty terms constructed via the projection matrix formula automatically result in a dual consistent formulation. Section 5 introduces a diagnostic test to flag penalty formulations suffering from stiffness, and shows that certain penalty terms obtained by the projection matrix formula are provably non-stiff. Section 6 revisits the motivating example and applies the general theory to construct non-stiff penalty terms that are either energy conservative or energy dissipative. Section 7 presents numerical experiments that exemplify our theoretical developments. In particular, we present a challenging 2D air-water interface problem for the wave equation. Finally, we discuss our findings in Sect. 8.

2 A motivating example

Artificial stiffness can easily emerge in coupled problems [10, 17]. In the decoupled case, each problem has some maximum stable time step \(\varDelta t_{max}^{(i)}\), for \(i=1,2\), when using a specific explicit time integrator. On the other hand, if the two subdomains are coupled together, one might obtain a \(\varDelta t_{max}\) that is much smaller than both \(\varDelta t_{max}^{(1)}\) and \(\varDelta t_{max}^{(2)}\). This phenomenon stems from a suboptimal selection of penalty weights in the coupling procedure. We demonstrate this problem by coupling two wave equations at an interface \(\varGamma \) (see Fig. 1).

Fig. 1

Problem setup for coupling the wave equation across an interface \(\varGamma \)

The wave equation for each side of the interface is written as a first order hyperbolic system of equations in Cartesian coordinates,

$$\begin{aligned} \partial _t q^{(i)}+ A^{(i)}_x \partial _x q^{(i)}+ A^{(i)}_y \partial _y q^{(i)}= f^{(i)}, \quad (x,y) \in \varOmega ^{(i)}, \quad t \ge 0, \end{aligned}$$
(1)

where \(q^{(i)}= ( p^{(i)} / \rho _{i}c_{i}, v^{(i)}_x, v^{(i)}_y)\), \(i=1,2\). The vector \(q^{(i)}\) collects the pressure \(p^{(i)}\) and the components of the velocity field \(v^{(i)}= (v^{(i)}_x, v^{(i)}_y)\), decomposed with respect to the Cartesian basis. Moreover,

$$\begin{aligned} A^{(i)}_x = \begin{bmatrix} 0 &{} c_{i}&{} 0 \\ c_{i}&{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \end{bmatrix}, \ A^{(i)}_y = \begin{bmatrix} 0 &{} 0 &{} c_{i}\\ 0 &{} 0 &{} 0 \\ c_{i}&{} 0 &{} 0 \end{bmatrix}. \end{aligned}$$

The material parameters \(\rho _{1}> 0\) and \(\rho _{2}> 0\) are the respective densities on each side of the interface. Similarly, \(c_{1}\) and \(c_{2}\) are the respective wave speeds. The forcing/penalty terms \(f^{(i)}\) are responsible for weakly imposing coupling conditions in each subdomain.

2.1 Penalty terms

A general formulation of the penalty terms is

$$\begin{aligned} f^{(i)}= \mathcal {L}(\varSigma ^{(i)}L^T u), \end{aligned}$$
(2)

where \(u = (q^{(1)}, q^{(2)})\). The role of the lifting operator \(\mathcal {L}\) is to make sure that the penalty term acts on the boundary only [1, 27]. In one dimension, it is analogous to the Dirac delta function \(\delta (x-x_0)\), where \(x_0\) is the boundary point. In higher dimensions, it is defined by

$$\begin{aligned} \int _{\varOmega } v^T\mathcal {L}(u) d\varOmega = \int _{\partial \varOmega } v^T u ds, \end{aligned}$$
(3)

for smooth and vector-valued functions u and v, and where \(\partial \varOmega \) is an appropriate boundary part of the domain \(\varOmega \). This relationship is similar to the divergence theorem in that it relates a volume/surface integral (left-hand side) to a surface/line integral (right-hand side). The matrix \(L^T\) is the boundary operator, determined by the coupling conditions, and \(\varSigma ^{(i)}\) are penalty matrices. The penalty matrices describe how to form linear combinations of the coupling conditions for each equation.
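In an SBP discretization, a common discrete realization of the lifting operator is \(\mathcal {L}(u) \approx H^{-1}e_b u_b\), where H is the SBP norm (quadrature) matrix and \(e_b\) restricts to the boundary node. The following minimal 1D sketch (our own example, not taken from this paper) verifies the discrete analogue of (3):

```python
import numpy as np

# Minimal 1D sketch of a discrete lifting operator: with a diagonal SBP norm
# (quadrature) matrix H, lift(u_b) = H^{-1} e_b u_b is supported only at the
# boundary node and mimics the defining property (3).
n, h = 30, 1.0 / 29
H = h * np.diag([0.5] + [1.0] * (n - 2) + [0.5])
e_b = np.zeros(n); e_b[-1] = 1.0               # restriction to the boundary node

def lift(u_b):
    """Discrete lifting: H^{-1} e_b u_b."""
    return np.linalg.solve(H, e_b * u_b)

rng = np.random.default_rng(1)
v, u_b = rng.standard_normal(n), rng.standard_normal()
# Discrete analogue of (3): the volume quadrature of v^T lift(u) equals the
# boundary product v_b u_b (in 1D the "surface integral" is a point value).
assert np.isclose(v @ H @ lift(u_b), v[-1] * u_b)
```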

2.2 Well-posedness

For the problem to be well-posed, there are three general requirements. See [20] for further details.

Definition 1

The problem (1) subject to the weak boundary/coupling conditions (2) is well-posed if:

  1. A solution exists,

  2. The solution is unique,

  3. The solution is bounded.

For existence, there cannot be too many boundary conditions; for uniqueness, there cannot be too few; and for a bound, the boundary conditions must have an appropriate form. In this work, we weakly impose the minimum number of appropriate boundary/coupling conditions that bound the solution.

2.2.1 Coupling conditions

One set of well-posed and physically relevant coupling conditions for the wave equation is

$$\begin{aligned} p^{(1)}- p^{(2)}= 0, \quad v^{(1)}\cdot n - v^{(2)}\cdot n = 0 \quad \ (x,y) \in \varGamma . \end{aligned}$$
(4)

In (4), \(n = (n_x, n_y)\) is the unit normal on the interface \(\varGamma \), directed towards \(\varOmega ^{(2)}\). The coupling conditions (4) can be expressed in matrix vector form \(L^Tu\), as done in (2), where

$$\begin{aligned} L^T u&= \begin{bmatrix} p^{(1)}- p^{(2)}\\ v^{(1)}_n - v^{(2)}_n \end{bmatrix}, \ L^T = \begin{bmatrix} \rho _{1}c_{1}&{} 0 &{} 0 &{} -\rho _{2}c_{2}&{} 0 &{} 0 \\ 0 &{} n_x &{} n_y &{} 0 &{} -n_x &{} -n_y \end{bmatrix}, \end{aligned}$$
(5)

and \(v^{(1)}_n = v^{(1)}\cdot n\), \(v^{(2)}_n = v^{(2)}\cdot n\).

As previously mentioned, the penalty terms are formed by taking a linear combination of the coupling conditions (4) and distributing them among the equations, yielding

$$\begin{aligned} f^{(i)}= \mathcal {L}\left( \varSigma ^{(i)}L^T u \right) = \mathcal {L}\left( \begin{bmatrix} \sigma ^{(i)}_{11}(p^{(1)}- p^{(2)}) +\sigma ^{(i)}_{12}(v^{(1)}_n - v^{(2)}_n) \\ \sigma ^{(i)}_{21}(p^{(1)}- p^{(2)}) +\sigma ^{(i)}_{22}(v^{(1)}_n - v^{(2)}_n) \\ \sigma ^{(i)}_{31}(p^{(1)}- p^{(2)}) +\sigma ^{(i)}_{32}(v^{(1)}_n - v^{(2)}_n) \\ \end{bmatrix} \right) , \end{aligned}$$
(6)

where

$$\begin{aligned} \varSigma ^{(i)}= \begin{bmatrix} \sigma ^{(i)}_{11} &{} \sigma ^{(i)}_{12} \\ \sigma ^{(i)}_{21} &{} \sigma ^{(i)}_{22} \\ \sigma ^{(i)}_{31} &{} \sigma ^{(i)}_{32} \\ \end{bmatrix}. \end{aligned}$$

2.3 The energy method

The weights in the penalty matrices \(\varSigma ^{(1)}\) and \(\varSigma ^{(2)}\) are determined by bounding the solution via the energy method [13].

Let \(\Vert u\Vert = \sqrt{\int _{\varOmega } u^T u d\varOmega }\) denote the \(L_2\)-norm. By differentiating \(\Vert q^{(i)}\Vert ^2\) with respect to t and substituting \(\partial _t q^{(i)}\) using (1) it follows that

$$\begin{aligned} \frac{d \Vert q^{(i)}\Vert ^2}{dt} = - \int _{\varOmega ^{(i)}} 2(q^{(i)})^T\left( A^{(i)}_x \partial _x q^{(i)}+ A^{(i)}_y \partial _y q^{(i)}- f^{(i)}\right) d\varOmega . \end{aligned}$$
(7)

By applying Gauss’s theorem and the definition of the lifting operator (3), the volume integral in (7) can be converted into the following surface integral,

$$\begin{aligned} \frac{d \Vert q^{(i)}\Vert ^2}{dt} = - \int _{\varGamma } (q^{(i)})^T \tilde{A}^{(i)} q^{(i)}- 2 (q^{(i)})^T \varSigma ^{(i)}L^T u ds, \end{aligned}$$
(8)

where \(\tilde{A}^{(1)} = A_x^{(1)}n_x + A_y^{(1)}n_y\), \(\tilde{A}^{(2)} = -A_x^{(2)}n_x - A_y^{(2)}n_y\). Equation (8) shows that the energy \(\Vert q^{(i)}\Vert \) is conserved in the interior, and changes in the energy rate are purely determined by fluxes across the interface. Note that in (8), the energy rate contribution from exterior boundaries has been neglected.

By summing over \(i=1,2\) in (8) and scaling the two terms by positive parameters \(\alpha \) and \(\beta \), respectively, we get

$$\begin{aligned} \alpha \frac{d \Vert q^{(1)}\Vert ^2}{dt} + \beta \frac{d \Vert q^{(2)}\Vert ^2}{dt} = - \int _{\varGamma } u^T \hat{A} u - 2 u^T {\hat{\varSigma }} L^T u ds, \end{aligned}$$
(9)

where

$$\begin{aligned} \hat{A} = \text{ diag }\left( \alpha \tilde{A}^{(1)}, \beta \tilde{A}^{(2)}\right) \ \text{ and } \ {\hat{\varSigma }} = \begin{bmatrix} \alpha \varSigma ^{(1)}\\ \beta \varSigma ^{(2)}\end{bmatrix}. \end{aligned}$$

Following [8], we can prove the following proposition.

Proposition 1

The solution of the problem (1) is bounded and the energy (9) is conserved if the penalty matrices in (6) are chosen as

$$\begin{aligned} \sigma ^{(i)}_{11}&= \sigma ^{(i)}_{22} = \sigma ^{(i)}_{32} = 0, \end{aligned}$$
(10)
$$\begin{aligned} \sigma ^{(i)}_{12}&= c_{i}k_{i}, \ \sigma ^{(i)}_{21} = \frac{n_x}{\rho _{i}}l_{i}, \ \sigma ^{(i)}_{31} = \frac{n_y}{\rho _{i}}l_{i}, \ \end{aligned}$$
(11)

where \(k_{i}\) and \(l_{i}\) are real parameters that satisfy the constraints

$$\begin{aligned} 1 - k_{1}- l_{1}= 0, \quad k_{1}- l_{2}= 0, \quad l_{1}- k_{2}= 0. \end{aligned}$$
(12)

Proof

A direct calculation yields,

$$\begin{aligned} u^T \hat{A} u = \alpha (q^{(1)})^T \tilde{A}^{(1)} q^{(1)}+ \beta (q^{(2)})^T \tilde{A}^{(2)} q^{(2)}= 2 \left( \frac{\alpha }{\rho _{1}}p^{(1)}v^{(1)}_n - \frac{\beta }{\rho _{2}}p^{(2)}v^{(2)}_n \right) . \end{aligned}$$

Choosing \(\alpha = \rho _{1}\) and \(\beta = \rho _{2}\), dividing by two, and expanding (9) yields

$$\begin{aligned} \begin{aligned} \sum _{i=1}^2 \frac{\rho _{i}}{2} \frac{d \Vert q^{(i)}\Vert ^2}{dt}&= - \int _{\varGamma } p^{(1)}v^{(1)}_n - p^{(2)}v^{(2)}_n - \mathcal {R} ds, \end{aligned} \end{aligned}$$
(13)

where

$$\begin{aligned} \mathcal {R}= & {} \left( \sum _{i=1}^2 \rho _{i}(\sigma ^{(i)}_{11}\frac{p^{(i)}}{\rho _{i}c_{i}} + \sigma ^{(i)}_{21}v^{(i)}_x + \sigma ^{(i)}_{31}v^{(i)}_y) \right) (p^{(1)}- p^{(2)})\nonumber \\&+ \left( \sum _{i=1}^2 \rho _{i}(\sigma ^{(i)}_{12}\frac{p^{(i)}}{\rho _{i}c_{i}} + \sigma ^{(i)}_{22}v^{(i)}_x + \sigma ^{(i)}_{32}v^{(i)}_y) \right) (v^{(1)}_n - v^{(2)}_n). \end{aligned}$$
(14)

Finally, the right-hand side in (13) vanishes if (10)–(12) hold. \(\square \)
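The cancellation in the proof can be spot-checked numerically. The sketch below draws random interface states and a random free parameter satisfying (12); the concrete normal and the random ranges are our own choices, and the densities and wave speeds cancel out of the expression:

```python
import numpy as np

# Spot-check of Proposition 1: with sigma_11 = sigma_22 = sigma_32 = 0 and the
# weights (11)-(12), the boundary integrand p1*v1n - p2*v2n - R in (13)
# vanishes identically, for any fields and any admissible free parameter.
rng = np.random.default_rng(5)
nx, ny = 0.6, 0.8                                  # unit normal on the interface
k1 = rng.uniform(-1.0, 1.0)                        # the one free parameter
l1, l2, k2 = 1.0 - k1, k1, 1.0 - k1                # the constraints (12)
k, l = (k1, k2), (l1, l2)
p = rng.standard_normal(2)                         # interface pressures p1, p2
vx, vy = rng.standard_normal(2), rng.standard_normal(2)
vn = nx * vx + ny * vy                             # normal velocities v1n, v2n
# R from (14): rho_i (sigma_21 vx + sigma_31 vy) = l_i vn_i, and
# rho_i sigma_12 p_i / (rho_i c_i) = k_i p_i, so densities drop out.
R = sum(l[i] * vn[i] for i in range(2)) * (p[0] - p[1]) \
    + sum(k[i] * p[i] for i in range(2)) * (vn[0] - vn[1])
assert np.isclose(p[0] * vn[0] - p[1] * vn[1], R)
```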

2.4 Stiffness

Proposition 1 states sufficient conditions for obtaining an energy conservative coupling. However, the constraints (12) imply that there is one free parameter. The problem we deal with in this paper is: how should this parameter be chosen? It might be tempting to try to make all parameters share the same magnitude. For instance, as was done in [8], one can choose

$$\begin{aligned} k_{1}= k_{2}= l_{1}= l_{2}= \frac{1}{2}. \end{aligned}$$
(15)

However, as we will see next, this choice causes stiffness (as do many other choices).

Before we proceed, we need to discuss the spatial discretization of the coupled wave equation problem. The discretization of (1) in space leads to a linear system of ordinary differential equations:

$$\begin{aligned} \frac{d\varvec{q}}{dt} = M_h\varvec{q}, \quad \varvec{q}= \begin{bmatrix} \varvec{p}\\ \varvec{v}\end{bmatrix}, \end{aligned}$$
(16)

where \(\varvec{p}\) and \(\varvec{v}\) discretize the pressure and velocity fields, and \(M_h\) is a sparse matrix that holds the discretization of the spatial operators and all penalty terms applied to each side of the interface (see Sect. 5.2 for more details). Since we are only interested in understanding how the penalty parameters influence \(\varDelta t_{max}\), the specific spatial discretization technique is not important (e.g., finite difference, finite volume, discontinuous Galerkin, etc.).

The maximum stable time step of an explicit time integrator can be estimated as

$$\begin{aligned} \varDelta t_{max} = CFL \frac{h}{ c_{max} } . \end{aligned}$$
(17)

In (17), h is the grid spacing (taken to be constant and the same in each subdomain), and \(c_{max} = \max (c_{1}, c_{2})\). To prevent stiffness from occurring, \(\varDelta t_{max}\) must be no smaller than \(\min \left( \varDelta t_{max}^{(1)}, \varDelta t_{max}^{(2)}\right) \), where \(\varDelta t_{max}^{(1)}\) and \(\varDelta t_{max}^{(2)}\) are associated with the respective uncoupled cases. The naive parameter choice (15) violates this condition. For this choice, the spectral radius of \(M_h\), denoted \(\varrho (M_h)\), depends on the density ratio, which causes stiffness (see Fig. 2). For example, if \(\rho _{1}/\rho _{2}\approx 100\) then \(\varDelta t_{max}\) must be reduced by a factor of \(\approx 10\) compared to \(\min \left( \varDelta t_{max}^{(1)}, \varDelta t_{max}^{(2)}\right) \). Our ambition in this paper is to find a procedure that removes the kink in Fig. 2 and minimizes the spectral radius.
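A 1D analogue makes this mechanism concrete. The sketch below assembles \(M_h\) for two coupled 1D wave systems \(q = (p/\rho c, v)\) with second-order SBP operators, energy conservative outer walls, and the interface penalties (6) with the naive weights (15); the 1D setting, grid sizes, and wall treatment are our own assumptions, not the 2D experiment behind Fig. 2:

```python
import numpy as np

def sbp(n, h):
    """Second-order diagonal-norm SBP pair (H, D) with D = H^{-1} Q."""
    H = h * np.diag([0.5] + [1.0] * (n - 2) + [0.5])
    Q = 0.5 * (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1))
    Q[0, 0], Q[-1, -1] = -0.5, 0.5
    return H, np.linalg.solve(H, Q)

def coupled_matrix(rho1, rho2, c1=1.0, c2=1.0, n=40):
    """M_h for two 1D wave systems, q = (P, v) with P = p / (rho c), coupled
    at a shared node with the weights (10)-(11) and k_i = l_i = 1/2 as in (15)."""
    H, D = sbp(n, 1.0 / (n - 1))
    hi0, hiN = 1.0 / H[0, 0], 1.0 / H[-1, -1]
    k1 = k2 = l1 = l2 = 0.5
    M = np.zeros((4 * n, 4 * n))
    P1, v1 = slice(0, n), slice(n, 2 * n)
    P2, v2 = slice(2 * n, 3 * n), slice(3 * n, 4 * n)
    M[P1, v1] = -c1 * D                          # dP/dt = -c dv/dx
    M[v1, P1] = -c1 * D                          # dv/dt = -c dP/dx
    M[P2, v2] = -c2 * D
    M[v2, P2] = -c2 * D
    # Energy conservative outer walls (pressure-equation penalty only).
    M[0, n] += -hi0 * c1                         # left end of domain 1
    M[3 * n - 1, 4 * n - 1] += hiN * c2          # right end of domain 2
    # Interface penalties (6) at the last node of domain 1 and the first node
    # of domain 2, with p_i = rho_i c_i P_i in the coupling conditions (4).
    iP1, iv1, iP2, iv2 = n - 1, 2 * n - 1, 2 * n, 3 * n
    M[iP1, iv1] += hiN * c1 * k1
    M[iP1, iv2] -= hiN * c1 * k1
    M[iv1, iP1] += hiN * l1 * c1
    M[iv1, iP2] -= hiN * (l1 / rho1) * rho2 * c2
    M[iP2, iv1] += hi0 * c2 * k2
    M[iP2, iv2] -= hi0 * c2 * k2
    M[iv2, iP1] += hi0 * (l2 / rho2) * rho1 * c1
    M[iv2, iP2] -= hi0 * l2 * c2
    return M

e_equal = np.linalg.eigvals(coupled_matrix(1.0, 1.0))        # density ratio 1
e_contrast = np.linalg.eigvals(coupled_matrix(1.0e4, 1.0))   # density ratio 10^4
r_equal, r_contrast = max(abs(e_equal)), max(abs(e_contrast))
```

In this sketch the spectrum stays purely imaginary for any density ratio (the coupling is conservative), yet the spectral radius grows by more than an order of magnitude at \(\rho_1/\rho_2 = 10^4\), forcing a correspondingly smaller \(\varDelta t_{max}\).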

Fig. 2

The maximum stable time step \(\varDelta t_{max}\) as a function of the density ratio \(\rho _{1}/ \rho _{2}\) (on a log-log scale). This plot was generated by discretizing the coupled wave equation (1) using \(c = c_{1}= c_{2}\), fourth order SBP finite difference operators [28], and 100 grid points in each subdomain

3 General treatment

As made apparent by the previous example, the design of penalty terms can strongly influence the efficacy of the numerical scheme. One option is to perform numerical experiments to find optimal penalty weights, but that can be cumbersome and impractical. In this section, we present a general approach that automatically leads to non-stiff penalty weights.

We again consider the problem (1) but for simplicity focus on one subdomain, and now with general and symmetric \(A_x\) and \(A_y\) matrices:

$$\begin{aligned} \partial _t q + A_x \partial _x q + A_y \partial _y q = f, \quad (x,y) \in \varOmega , \quad t \ge 0. \end{aligned}$$
(18)

Since we are only considering one subdomain, all superscripts have been dropped. Our procedure for deriving penalty terms is inspired by the procedure developed in [17, 18]. We start with a modified ansatz for the penalty term,

$$\begin{aligned} f = \mathcal {L}\left( A \delta q \right) , \quad A = A_xn_x + A_yn_y. \end{aligned}$$
(19)

In this formulation, \(\delta q = \delta q(q)\) is an unknown penalty vector that depends on q such that a minimum number of boundary conditions are weakly imposed.

3.1 The energy method

Next, we introduce the bilinear form

$$\begin{aligned} \varPhi (q, v) = \oint _{\partial \varOmega } q^T A v ds. \end{aligned}$$
(20)

Then, the energy method applied to (18) can now be written as

$$\begin{aligned} \frac{d}{dt}\Vert q\Vert ^2 = -\varPhi (q, q) + 2\varPhi (q, \delta q). \end{aligned}$$

We can reformulate the above by completing the square, yielding

$$\begin{aligned} \frac{d}{dt}\Vert q\Vert ^2 = -\varPhi (q - \delta q, q - \delta q) + \varPhi (\delta q, \delta q). \end{aligned}$$
(21)

The solution is bounded if the right-hand side of (21) is non-positive. However, here we enforce the stronger requirement that each term on the right-hand side is individually non-positive (not just their sum):

$$\begin{aligned} -\varPhi (q - \delta q, q - \delta q) \le 0 \quad \text{ and } \quad \varPhi (\delta q, \delta q) \le 0. \end{aligned}$$
(22)

We will see that the two conditions in (22) uniquely determine \(\delta q\).

3.2 Diagonalization

Since A is symmetric, it is diagonalizable by a complete set of orthogonal eigenvectors, \(A = X\varLambda X^T\), with \(X^{-1} = X^T\). Here, X is a matrix that contains the eigenvectors arranged as column vectors, and \(\varLambda \) is a diagonal matrix containing the corresponding eigenvalues. The eigenvalue matrix \(\varLambda \) is split into a positive part \(\varLambda ^+ > 0\) and a negative part \(\varLambda ^- < 0\), and \(X = [X^+,\ X^-]\). Without loss of generality, we assume that A contains no zero eigenvalues (if it does, they are ignored).

We also define the transformation \(w = \sqrt{|\varLambda |} X^T q\), leading to

$$\begin{aligned} w^+= \sqrt{\varLambda ^+}(X^+)^T q \quad \text{ and } \quad w^-= \sqrt{|\varLambda ^-|}(X^-)^T q. \end{aligned}$$
(23)

This way of defining \(w^+\) and \(w^-\) simplifies the upcoming presentation. For the other variable, \(\delta q\), we define similar transforms, i.e.,

$$\begin{aligned} \delta w^+= \sqrt{\varLambda ^+}(X^+)^T \delta q, \quad \text{ and } \quad \delta w^-= \sqrt{|\varLambda ^-|}(X^-)^T \delta q. \end{aligned}$$
(24)

When the eigendecomposition is applied to the quadratic form \(\varPhi (q,q)\), we find

$$\begin{aligned} \varPhi (q,q) = \oint _{\partial \varOmega } q^T A q ds = \oint _{\partial \varOmega } (w^+)^T w^+- (w^-)^T w^-ds. \end{aligned}$$
(25)

3.3 Boundary conditions

Recall that the goal is to determine the penalty vector \(\delta q\) such that (22) holds subject to the boundary conditions.

3.3.1 Strong boundary conditions

For strong boundary conditions, it is natural to impose (see [20])

$$\begin{aligned} w^- = Rw^+, \quad (x,y) \in \partial \varOmega , \end{aligned}$$
(26)

where R is a rectangular matrix that imposes a minimum number of boundary conditions. From (25), we get

$$\begin{aligned} \varPhi (q,q) = \oint _{\partial \varOmega } (w^+)^T(w^+) - (w^-)^T(w^-) ds = \oint _{\partial \varOmega } (w^+)^T(I^+ - R^TR)w^+ ds. \end{aligned}$$

For strong boundary conditions, we have \(\delta q = 0\), and the energy rate (21) becomes

$$\begin{aligned} \frac{d}{dt}\Vert q\Vert ^2 = -\varPhi (q, q). \end{aligned}$$
(27)

It follows that the right-hand side in (27) is non-positive if \(R^TR \le I^+\), where \(I^+\) is the identity matrix of the same size as \(w^+\). Clearly, this condition is satisfied when R is sufficiently small.

3.3.2 The weak primal condition

To satisfy the first condition in (22), equivalent to \(\varPhi (q - \delta q, q - \delta q) \ge 0\), we proceed in the same way and specify

$$\begin{aligned} {w}^- - \delta w^- = R(w^+ - \delta w^+), \quad (x,y) \in \partial \varOmega . \end{aligned}$$
(28)

If \(R^TR \le I^+\), we obtain

$$\begin{aligned} \varPhi (q - \delta q,q-\delta q) = \oint _{\partial \varOmega } (w^+-\delta w^+)^T\left( I^+ - R^TR\right) (w^+-\delta w^+) ds \ge 0. \end{aligned}$$
(29)

3.3.3 The weak dual condition

As seen in (22), we also need \(\varPhi (\delta q, \delta q) \le 0\). To obtain that, we specify

$$\begin{aligned} \delta w^+ = \delta R \delta w^-, \quad (x,y) \in \partial \varOmega , \end{aligned}$$
(30)

where \(\delta R\) is another rectangular matrix. If \(\delta R^T\delta R \le I^-\), we obtain

$$\begin{aligned} \varPhi (\delta q,\delta q) = - \oint _{\partial \varOmega } (\delta {w}^-)^T\left( I^- - (\delta R)^T(\delta R)\right) \delta {w}^- ds \le 0. \end{aligned}$$
(31)

Interestingly, the condition \((\delta R)^T(\delta R) \le I^-\) corresponds to the well-posedness condition for the dual problem of (18). This is further discussed in Sect. 4.

3.4 Well-posedness

The conditions (28) and (30) determine \(\delta q\) uniquely under the conditions given in the following Proposition.

Proposition 2

Let R, \(\delta R\) be the matrices in (28), (30) satisfying (29), (31). Then, if \(\det \left( I^- - R\delta R\right) \ne 0\),

$$\begin{aligned} \delta q = \left( X^+\sqrt{(\varLambda ^+)^{-1}}\delta R + X^-\sqrt{|\varLambda ^-|^{-1}}\right) \left( I^- - R \delta R\right) ^{-1}\left( w^- - Rw^+ \right) . \end{aligned}$$
(32)

Proof

The starting point is the eigenvector transformation (24). Since \(X^T = X^{-1}\), the inverse transformation is

$$\begin{aligned} \delta q = X^+\sqrt{(\varLambda ^+)^{-1}}\delta w^+ + X^-\sqrt{|\varLambda ^-|^{-1}}\delta w^-. \end{aligned}$$

Inserting (30) into the above yields

$$\begin{aligned} \delta q = \left( X^+\sqrt{(\varLambda ^+)^{-1}}\delta R + X^-\sqrt{|\varLambda ^-|^{-1}}\right) \delta w^-. \end{aligned}$$

Also, inserting (30) into (28) yields

$$\begin{aligned} w^- = (I^- - R\delta R)\delta w^- + R w^+. \end{aligned}$$

If \(\det \left( I^- - R\delta R\right) \ne 0\), then \(\delta w^- = (I^- - R\delta R)^{-1}\left( w^- - Rw^+\right) \). Inserting this final expression into \(\delta q\) results in the formula (32). \(\square \)

Proposition 2 shows that \(\delta q\) is formulated in terms of the boundary condition \(w^- - Rw^+\). Hence, this formulation imposes a minimum number of boundary conditions, as required for well-posedness, and we have proven the following result.

Proposition 3

The problem (18) subject to the weak boundary conditions (19) is well-posed if (26), (28) and (30) hold, and

$$\begin{aligned} R^T R \le I^+ \quad \text{ and } \quad (\delta R)^T \delta R \le I^-. \end{aligned}$$
(33)

3.5 Projection matrix formula

There is an alternative procedure to determine \(\delta q\), obtained via \(\delta q = Pq\), where P is a projection matrix. By definition, a projection matrix satisfies the idempotency property \(P = P P\).

Proposition 4

Let

$$\begin{aligned} L = X^-\sqrt{|\varLambda ^-|}- X^+ \sqrt{\varLambda ^+}R^T, \quad \delta L = X^-\sqrt{|\varLambda ^-|^{-1}}+ X^+ \sqrt{(\varLambda ^+)^{-1}}\delta R. \end{aligned}$$
(34)

Then \(\delta q = Pq\), where P is the projection matrix

$$\begin{aligned} P = \delta L (L^T\delta L)^{-1} L^T. \end{aligned}$$
(35)

Proof

We start by deriving the formula for P and then show that this matrix is a projection matrix. Note that \(\delta q\) in (32) can be written as

$$\begin{aligned} \delta q = \delta L (I^- - R \delta R)^{-1} L^T q. \end{aligned}$$

Hence, it remains to show that \(L^T\delta L = I^- - R \delta R\). The result immediately follows from eigenvector orthogonality: \((X^+)^TX^- = 0\), \((X^-)^TX^+ = 0\), \((X^+)^TX^+ = I^+\), and \((X^-)^TX^- = I^-\).

For P to be a projection matrix it must satisfy \(P = P P\). We have

$$\begin{aligned} \begin{aligned} P P&= \left( \delta L (L^T\delta L)^{-1} L^T \right) \left( \delta L (L^T\delta L)^{-1} L^T \right) = \delta L (L^T\delta L)^{-1} \left( L^T\delta L \right) (L^T\delta L)^{-1} L^T = P. \end{aligned} \end{aligned}$$

\(\square \)

By applying Proposition 4, the penalty term (19) can be formulated as

$$\begin{aligned} f = \mathcal {L}\left( AP q \right) . \end{aligned}$$
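Propositions 2 and 4 can be checked numerically for the same wave-equation boundary matrix as in Sect. 2, which has one positive and one negative eigenvalue, so R and \(\delta R\) reduce to scalars; the admissible values below (\(R = 0.5\), \(\delta R = -0.3\)) are our own illustrative choices:

```python
import numpy as np

# Numerical check of Propositions 2 and 4. The wave-equation boundary matrix
# has eigenvalues {+c, -c, 0}; the zero eigenvalue is ignored, so R and dR
# are 1x1, with R^2 <= 1 and dR^2 <= 1 satisfying (33).
c, (nx, ny) = 1.5, (0.8, 0.6)
A = c * np.array([[0.0, nx, ny],
                  [nx, 0.0, 0.0],
                  [ny, 0.0, 0.0]])
lam, X = np.linalg.eigh(A)                     # eigenvalues in ascending order
Xm, Xp = X[:, [0]], X[:, [2]]                  # eigenvectors for -c and +c
R, dR = np.array([[0.5]]), np.array([[-0.3]])
L = Xm * np.sqrt(c) - Xp * np.sqrt(c) @ R.T    # L in (34)
dL = Xm / np.sqrt(c) + Xp / np.sqrt(c) @ dR    # delta L in (34)
P = dL @ np.linalg.inv(L.T @ dL) @ L.T         # the projection matrix (35)
assert np.allclose(P @ P, P)                   # idempotency

rng = np.random.default_rng(3)
q = rng.standard_normal(3)
dq = P @ q                                     # the penalty vector delta q
wp, wm = np.sqrt(c) * Xp.T @ q, np.sqrt(c) * Xm.T @ q
dwp, dwm = np.sqrt(c) * Xp.T @ dq, np.sqrt(c) * Xm.T @ dq
assert np.allclose(wm - dwm, R @ (wp - dwp))   # the weak primal condition (28)
assert np.allclose(dwp, dR @ dwm)              # the weak dual condition (30)
# The closed-form expression (32) gives the same delta q.
dq32 = (Xp / np.sqrt(c) @ dR + Xm / np.sqrt(c)) \
    @ np.linalg.inv(np.eye(1) - R @ dR) @ (wm - R @ wp)
assert np.allclose(dq, dq32)
# The sign conditions (22) hold pointwise for this delta q.
assert (q - dq) @ A @ (q - dq) >= -1e-12
assert dq @ A @ dq <= 1e-12
```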

4 The dual problem

We show how the general formulation presented in the previous section is connected to the dual problem.

4.1 The strong dual problem

Define the functional

$$\begin{aligned} J(u, v) = \int _0^T \int _{\varOmega } u^T v d\varOmega dt. \end{aligned}$$

The dual of the primal problem (18) is derived from the condition \(J(q^*, f) = J(f^*, q)\), where \(f^*\) defines the dual problem (to be determined). By inserting (18) into \(J(q^*, f)\) and applying integration by parts in time and space, we get

$$\begin{aligned} J(q^{*}, f)= & {} J(q^{*}, \partial _t q + A_x \partial _x q + A_y \partial _y q) \nonumber \\= & {} J(f^*, q) + \int _{0}^T \varPhi (q,q^{*}) dt + \int _{\varOmega } \varPsi (q,q^{*}) d\varOmega , \end{aligned}$$
(36)

where \(\varPhi (q,q^{*})\) is defined in (20), \(\varPsi (q,q^{*}) = \left[ q^T q^{*}\right] _{0}^{T}\), and

$$\begin{aligned} f^* = -\left( \partial _t q^{*}+ A_x \partial _x q^{*}+ A_y \partial _y q^{*}\right) . \end{aligned}$$

The terms \(\varPhi \) and \(\varPsi \) arising from integration by parts must vanish for the dual problem to be well-defined. The term \(\varPsi \) vanishes due to the initial condition \(q(x,0) = 0\) and the end condition \(q^{*}(x,T) = 0\).

Next, we derive the dual boundary conditions. Following Sect. 3.2, by diagonalizing and splitting A into positive and negative parts, we get

$$\begin{aligned} \varPhi (q,q^{*}) = \oint _{\partial \varOmega } (w^+)^Tw^{+*}- (w^-)^T w^{-*}ds, \end{aligned}$$
(37)

where \(w^{+*}= \sqrt{\varLambda ^+}(X^+)^Tq^{*}\) and \(w^{-*}= \sqrt{|\varLambda ^-|}(X^-)^Tq^{*}\). The strong primal boundary condition (26) leads to

$$\begin{aligned} \varPhi (q,q^{*}) = \oint _{\partial \varOmega } (w^+)^T\left( w^{+*}- R^Tw^{-*}\right) ds. \end{aligned}$$

Hence, to make \(\varPhi \) vanish, the dual boundary conditions must be

$$\begin{aligned} w^{+*}= R^Tw^{-*}, \quad (x,y) \in \partial \varOmega . \end{aligned}$$
(38)

By introducing the time reversal transformation \(\tau = T - t\), the dual problem becomes

$$\begin{aligned} \partial _{\tau } q^{*}- A_x\partial _x q^{*}- A_y\partial _y q^{*}&= f^*, \quad (x,y) \in \varOmega , \ \, \quad \tau \ge 0, \end{aligned}$$
(39)

subject to the boundary conditions (38).

Proposition 5

The dual problem (38)–(39) is well-posed if and only if the primal problem (18), (26) is well-posed.

Proof

The energy rate for the dual problem (39) with \(f^* = 0\) is

$$\begin{aligned} \frac{d \Vert q^{*}\Vert ^2}{d\tau } = \varPhi (q^{*}, q^{*}). \end{aligned}$$

By imposing the strong dual boundary conditions (38), we get

$$\begin{aligned} \varPhi (q^{*}, q^{*}) = -\oint _{\partial \varOmega } (w^{-*})^T\left( I^- - RR^T\right) w^{-*}ds. \end{aligned}$$

Hence, the dual problem is well-posed if \(R R^T \le I^-\). On the other hand, the primal problem is well-posed if \(R^TR \le I^+\) (see Proposition 3). We can show that the two conditions are the same using the singular value decomposition (SVD) \(R = U \varSigma V^T\), where U and V are real orthogonal matrices and \(\varSigma \) is the diagonal matrix of singular values. By applying the SVD to R, we have \(RR^T = U \varSigma ^2 U^T\), \(R^TR = V \varSigma ^2 V^T\). Thus, both the primal and dual problems are well-posed if \(\varSigma \le I\). \(\square \)
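The SVD argument can be spot-checked numerically (a sketch with a randomly drawn R of our choosing):

```python
import numpy as np

# Sketch of the SVD argument: for a rectangular R, the largest eigenvalues of
# R^T R and R R^T coincide and equal the largest squared singular value, so
# R^T R <= I^+ holds exactly when R R^T <= I^-.
rng = np.random.default_rng(4)
R = rng.standard_normal((2, 3))
s_max = np.linalg.svd(R, compute_uv=False)[0]   # largest singular value
lhs = max(np.linalg.eigvalsh(R.T @ R))          # largest eigenvalue of R^T R
rhs = max(np.linalg.eigvalsh(R @ R.T))          # largest eigenvalue of R R^T
assert np.isclose(lhs, rhs) and np.isclose(lhs, s_max ** 2)
```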

4.2 The weak dual problem

Since the analysis of the weak dual problem is very similar to that of the weak primal problem, some details are omitted. To weakly impose the boundary conditions (38), we set the right-hand side of (39) to

$$\begin{aligned} f^* = -\mathcal {L}\left( A\delta q^{*}\right) . \end{aligned}$$
(40)

In (40), \(\delta q^{*}\) is an unknown penalty vector. This vector is determined in the same manner as the penalty vector \(\delta q\) of the primal problem. When the dual problem (39) uses the penalty term (40), the energy method leads to

$$\begin{aligned} \frac{d}{d\tau }\Vert q^{*}\Vert ^2 = \varPhi (q^{*}- \delta q^{*}, q^{*}- \delta q^{*}) - \varPhi (\delta q^{*}, \delta q^{*}). \end{aligned}$$

To bound the solution, we require

$$\begin{aligned} \varPhi (q^{*}- \delta q^{*}, q^{*}- \delta q^{*}) \le 0 \quad \text{ and } \quad \varPhi (\delta q^{*}, \delta q^{*}) \ge 0. \end{aligned}$$
(41)

First, consider \(\varPhi (q^{*}- \delta q^{*}, q^{*}- \delta q^{*})\) and specify

$$\begin{aligned} w^{*+}-\delta w^{*+} = R^{*}\left( w^{*-}-\delta w^{*-}\right) , \quad (x,y) \in \partial \varOmega . \end{aligned}$$
(42)

By inserting (42) into (37) and requiring \(RR^T \le I^-\), we get

$$\begin{aligned} \varPhi (q^{*}- \delta q^{*},q^{*}- \delta q^{*}) = -\oint _{\partial \varOmega } \left( w^{*-}- \delta w^{*-}\right) ^T\left( I^- - RR^T \right) \left( w^{*-}- \delta w^{*-}\right) ds \le 0. \end{aligned}$$

Next, consider \(\varPhi (\delta q^{*}, \delta q^{*})\) and specify

$$\begin{aligned} \delta w^{*-}= \delta R^{*}\delta w^{*+}, \quad (x,y) \in \partial \varOmega . \end{aligned}$$
(43)

By inserting (43) into (37) and requiring \((\delta R^{*})^T \delta R^{*}\le I^+\), we get

$$\begin{aligned} \varPhi (\delta q^{*},\delta q^{*}) = \oint _{\partial \varOmega } (\delta w^{*+})^T\left( I^+ - (\delta R^{*})^T\delta R^{*}\right) \delta w^{*+}ds \ge 0. \end{aligned}$$

The results established thus far are summarized in the following proposition.

Proposition 6

The problem (39) subject to the weak boundary conditions (40) is well-posed if (38), (42)–(43) uniquely hold, and

$$\begin{aligned} RR^T \le I^- \quad \text{ and } \quad (\delta R^{*})^T (\delta R^{*}) \le I^+. \end{aligned}$$
(44)

4.3 Dual consistency

To achieve superconvergent linear functionals, it is important that the weak primal and dual problems are related to each other via an appropriate choice of penalty terms, f and \(f^*\) [21]. By substituting f and \(f^*\) in (36) using the penalty terms (19) and (40), we get

$$\begin{aligned} \varPhi (q^{*}, \delta q) = -\varPhi (q, \delta q^{*}) + \varPhi (q, q^{*}). \end{aligned}$$
(45)

To arrive at (45), we have neglected the integrals over t and set the initial and end conditions to zero. As shown in the following proposition, (45) establishes a relationship between the penalty parameters \(\delta R\) and \(\delta R^{*}\) that is analogous to the relationship between the primal (26) and dual (38) boundary conditions.

Proposition 7

The weak primal and dual problems (see Propositions 3 and 6) satisfy (45) if

$$\begin{aligned} \delta R^{*}= (\delta R)^T. \end{aligned}$$
(46)

Proof

See Appendix A for proof. \(\square \)

5 Bounded matrix norms

For spatial discretizations with all eigenvalues located in the correct half plane, the maximum stable time step is determined by the stability region of the particular time stepping scheme. For the spatial discretization to be robust, it must handle any choice of model parameters without unwanted growth of the spectral radius, which would otherwise force a costly reduction of the time step. To avoid computing the spectral radius, our goal in this section is to analyze a priori how the matrix norm of the spatial discretization scales (up to a constant factor) for a particular model problem. Since this analysis relies on symbolic computation, it can identify model parameters that cause suboptimal scaling of the matrix norm. If the scaling is the same as the expected CFL condition obtained from the uncoupled problem, we say that the discretization is non-stiff. This definition will be made precise below. In more detail, we focus on the following objectives:

  1. To develop an easy-to-use diagnostic test that indicates how the penalty parameters influence the matrix norm of the spatial discretization.

  2. To prove that the projection matrix formula (35) results in penalty terms that are non-stiff, in the sense discussed below.

5.1 Matrix analysis

Before proceeding, we briefly summarize key concepts and results from matrix analysis. For a matrix \(A \in {\mathbb {R}}^{m\times m}\), the 2-norm is defined by the induced vector norm

$$\begin{aligned} \Vert A\Vert _2 = \sup _{\begin{array}{c} \varvec{x}\in {\mathbb {R}}^{m} \\ \varvec{x}\ne 0 \end{array}} \frac{\Vert A \varvec{x}\Vert _2}{\Vert \varvec{x}\Vert _2}, \end{aligned}$$
(47)

where \(\Vert \varvec{x}\Vert _2 = \left( \sum _{i=1}^m x_i^2\right) ^{1/2}\). The Frobenius norm is

$$\begin{aligned} \Vert A\Vert _F = \sqrt{\text{ trace }(A^TA)}. \end{aligned}$$
(48)

The 2-norm and Frobenius norm satisfy the following properties:

$$\begin{aligned}&\varrho (A) \le \Vert A\Vert _2 \le \Vert A\Vert _F. \end{aligned}$$
(49)
$$\begin{aligned}&\Vert A B\Vert _{2} \le \Vert A\Vert _{2} \Vert B\Vert _{2}, \quad \Vert A B\Vert _{F} \le \Vert A\Vert _{F} \Vert B\Vert _{F}, \end{aligned}$$
(50)
$$\begin{aligned}&\Vert A \otimes B\Vert _{2} = \Vert A\Vert _{2} \Vert B\Vert _{2}, \quad \Vert A \otimes B\Vert _{F} = \Vert A\Vert _{F} \Vert B\Vert _{F}, \end{aligned}$$
(51)

where \(\otimes \) is the Kronecker product, and \(\varrho (A)\) is the spectral radius of A. See e.g., [12] for further properties and proofs of (49)–(51).
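Properties (49)–(51) are straightforward to verify numerically; a minimal sketch with random stand-in matrices of illustrative sizes:

```python
import numpy as np

# Numerical spot-check of the norm properties (49)-(51).
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((3, 3))

rho = max(abs(np.linalg.eigvals(A)))       # spectral radius
n2 = np.linalg.norm(A, 2)                  # 2-norm (largest singular value)
nF = np.linalg.norm(A, 'fro')              # Frobenius norm

assert rho <= n2 + 1e-12 and n2 <= nF + 1e-12           # property (49)
assert np.linalg.norm(A @ A, 2) <= n2 * n2 + 1e-9       # property (50)

K = np.kron(A, B)                                       # property (51)
assert np.isclose(np.linalg.norm(K, 2), n2 * np.linalg.norm(B, 2))
assert np.isclose(np.linalg.norm(K, 'fro'), nF * np.linalg.norm(B, 'fro'))
```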

5.2 Model problem

As the model problem, we consider the 1D hyperbolic PDE

$$\begin{aligned} u_t + A_xu_x = \mathcal {L} (\varSigma L^T) u \end{aligned}$$
(52)

and discretize it in space with SBP operators,

$$\begin{aligned} \frac{d\varvec{u}}{dt} = M_h\varvec{u}, \quad M_h = - A_x \otimes D_x + \mathcal {L}_h \otimes \varSigma L^T. \end{aligned}$$
(53)

Here, \(\varvec{u}= [{\varvec{u}}_0 \ {\varvec{u}}_1 \ \ldots \ {\varvec{u}}_m]^T\), \({\varvec{u}}_i = [{u}_{i0} \ {u}_{i1} \ \ldots \ {u}_{iN} ]\), \({u}_{ij} = u_i(x_j,t)\), where \(x_j = jh\) and h is the grid spacing for an equidistant grid. The matrix \(D_x\) is defined by \(D_x = H_x^{-1}Q_x\), where \(H_x = H_x^T > 0\). The matrix \(Q_x\) satisfies the summation-by-parts property:

$$\begin{aligned} Q_x + Q_x^T = \varvec{e}_N\varvec{e}_N^T - \varvec{e}_0\varvec{e}_0^T. \end{aligned}$$

This property is key for proving energy stability of an SBP-based scheme. The vectors \(\varvec{e}_N\) and \(\varvec{e}_0\) are typically chosen as \(\varvec{e}_0 = [1 \ 0 \ \ldots \ 0]^T\), \(\varvec{e}_N = [0 \ \ldots \ 0 \ 1]^T\). Their role is to extract the boundary values \(u_{i0}\) and \(u_{iN}\) from \(\varvec{u}\). For further details about SBP-based discretizations, see [7, 30], and the references therein.
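As an illustration, the summation-by-parts property can be verified for the classical second-order accurate SBP pair; a minimal sketch (the higher-order operators cited in the text have wider boundary closures but satisfy the same algebraic identities):

```python
import numpy as np

# Second-order SBP pair (H_x, Q_x) on an equidistant grid x_0..x_N.
N = 9
h = 1.0 / N
H = h * np.diag([0.5] + [1.0] * (N - 1) + [0.5])
Q = 0.5 * (np.diag(np.ones(N), 1) - np.diag(np.ones(N), -1))
Q[0, 0], Q[N, N] = -0.5, 0.5

e0 = np.zeros(N + 1); e0[0] = 1.0
eN = np.zeros(N + 1); eN[N] = 1.0

# SBP property: Q_x + Q_x^T = e_N e_N^T - e_0 e_0^T
assert np.allclose(Q + Q.T, np.outer(eN, eN) - np.outer(e0, e0))

Dx = np.linalg.solve(H, Q)                      # D_x = H_x^{-1} Q_x
assert np.allclose(Dx @ np.ones(N + 1), 0.0)    # differentiates constants exactly
x = np.linspace(0.0, 1.0, N + 1)
assert np.allclose(Dx @ x, 1.0)                 # exact on linear functions too
```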

To simplify the forthcoming analysis, we assume that we only need to apply a penalty term on the left boundary, such that \(\mathcal {L}_h\) in (53) becomes

$$\begin{aligned} \mathcal {L}_h = H^{-1}_x \varvec{e}_0\varvec{e}_0^T. \end{aligned}$$

Moreover, we assume that \(D_x\) is a consistent approximation of the first derivative such that it differentiates a constant exactly: \(D_x \varvec{1}= 0\) for \(\varvec{1}= [1 \ 1 \ \ldots \ 1]^T\).

Before we can prove that the discretization (53) is non-stiff for appropriately chosen penalty parameters, we give the following definition of non-stiffness.

Definition 2

The semi-discrete approximation (53) of (52) is non-stiff if there exists a positive constant \(\gamma \), independent of the model parameters, such that

$$\begin{aligned} h \varrho (M_h) \le \gamma \varrho (A_x). \end{aligned}$$

5.3 Diagnostic test

Our objective is to decide how \(h\Vert M_h\Vert _F\) (or the 2-norm) scales for a particular choice of penalty terms without performing any numerical computations. This capability is useful because it enables one to perform a diagnostic test and assess the robustness of a particular weak coupling procedure without having to implement it and compute the spectral radius for a range of model parameters. If the test passes, then the weak coupling procedure is provably non-stiff. If the test fails, it indicates that the coupling procedure is sensitive to certain choices of model parameters. Our test consists of the following four steps.

  1. Determine \(\varSigma \) such that the problem is energy stable and the penalty parameters \(\sigma _{ij}\) are parameterized by the model parameters (e.g., \(a_{ij}\) in \(A_x\), and \(l_{ij}\) in L).

  2. Compute the spectral radius \(\varrho (A_x)\).

  3. Compute the norm of the penalty matrix \(\Vert \varSigma L^T\Vert _F\).

  4. Check if there exists a constant \(\gamma \), independent of the model parameters, such that \(\Vert \varSigma L^T\Vert _F \le \gamma \varrho (A_x)\).

  5. If the test fails, repeat it for the 2-norm.

  6. If the test still fails, redesign the coupling procedure to be provably non-stiff.

We recommend checking the Frobenius norm first, because it is much easier to compute than the 2-norm: computing the 2-norm requires symbolically computing the singular values of the penalty matrix.
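Steps 2–4 can be carried out in a computer algebra system; a minimal sketch on a hypothetical 1x2 penalty matrix (it mirrors the single-domain boundary example of Sect. 7.1.1, not a result derived here), where \(\varrho(A_x) = c\):

```python
import sympy as sp

# Symbolic diagnostic test on a toy penalty matrix Sigma L^T = [c, -alpha*c].
c, alpha = sp.symbols('c alpha', positive=True)
SLt = sp.Matrix([[c, -alpha * c]])

# Step 3: Frobenius norm, computed symbolically from trace((Sigma L^T)^T Sigma L^T).
frob = sp.sqrt((SLt.T * SLt).trace())
assert sp.simplify(frob**2 - c**2 * (1 + alpha**2)) == 0

# Step 4 fails here: frob / rho(A_x) = sqrt(1 + alpha^2) is unbounded in the
# model parameter alpha, so no parameter-independent gamma exists.
```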

The diagnostic test is motivated by the following assumption and proposition.

Assumption 1

For finite sized (fixed N) matrices \(D_x\) and \(\mathcal {L}_h\), there exist constants \(s_{d,l}, r_{d,l} > 0\) such that

$$\begin{aligned} r_d \le h\Vert D_x\Vert _2 \le s_d, \quad r_l \le h\Vert \mathcal {L}_h\Vert _2 \le s_l . \end{aligned}$$
(54)

Proposition 8

The matrix \(M_h\) of the spatial discretization (53) satisfies

$$\begin{aligned} r_l \Vert \varSigma L^T \Vert _{2} \le h\Vert M_h\Vert _{2} \le s_d \varrho (A_x) + s_l \Vert \varSigma L^T\Vert _{2}. \end{aligned}$$
(55)

Proof

See Appendix B for proof. \(\square \)

Remark 1

If there are multiple penalty terms present in \(M_h\), we use the reverse triangle inequality on each term to obtain a lower bound, i.e.,

$$\begin{aligned} \Vert A + B\Vert \ge \left| \Vert A\Vert - \Vert B\Vert \right| . \end{aligned}$$

Likewise, we use the triangle inequality to obtain an upper bound.

5.3.1 Suboptimal scaling

As an illustration, consider the motivating example presented in Sect. 2.

  1. From (6), the penalty matrix on each side of the interface is

    $$\begin{aligned} \varSigma ^{(i)}L^T = \begin{bmatrix} 0 &{} c_{i}k_{i}n_x &{} c_{i}k_{i}n_y &{} 0 &{} c_{i}k_{i}n_x &{} c_{i}k_{i}n_y \\ \frac{Z_1}{\rho _{i}} l_{i}n_x &{} 0 &{} 0 &{} -\frac{Z_2}{\rho _{i}} l_{i}n_x &{} 0 &{} 0 \\ \frac{Z_1}{\rho _{i}} l_{i}n_y &{} 0 &{} 0 &{} -\frac{Z_2}{\rho _{i}} l_{i}n_y &{} 0 &{} 0 \\ \end{bmatrix}. \end{aligned}$$

  2. The spectral radius is \(\varrho (A^{(i)}_x) = c_{i}\), and \(\max \varrho (A_x^{(i)}) \le c_{max}\), where \(c_{max} = \max (c_{1}, c_{2})\).

  3. The Frobenius norm of each penalty matrix is

    $$\begin{aligned} \Vert \varSigma ^{(1)}L^T \Vert _F&= \frac{1}{\rho _{1}}\sqrt{c_{1}^2\rho _{1}^2\left( 2k_{1}^2 + l_{1}^2\right) + c_{2}^2\rho _{2}^2l_{1}^2}, \\ \Vert \varSigma ^{(2)}L^T \Vert _F&= \frac{1}{\rho _{2}}\sqrt{c_{2}^2\rho _{2}^2\left( 2k_{2}^2 + l_{2}^2\right) + c_{1}^2\rho _{1}^2l_{2}^2}. \end{aligned}$$

  4. For the naive choice of parameters (15), we get

    $$\begin{aligned} \Vert \varSigma ^{(1)}L^T \Vert _F \le \frac{c_{max}}{2}\sqrt{3 + \frac{\rho _{2}^2}{\rho _{1}^2}} \quad \text{ and } \quad \Vert \varSigma ^{(2)}L^T \Vert _F \le \frac{c_{max}}{2}\sqrt{3 + \frac{\rho _{1}^2}{\rho _{2}^2}}. \end{aligned}$$

    The presence of the density ratios causes linear growth in \(h\Vert M_h\Vert _F\) when \(\rho _{2}/\rho _{1}\ll 1\) or \(\rho _{2}/\rho _{1} \gg 1\).

  5. Since we have failed to bound the penalty matrices in the Frobenius norm, we proceed to check the 2-norm:

    $$\begin{aligned} \Vert \varSigma ^{(1)}L^T \Vert _2&= \max \left( \frac{c_{1}}{2\sqrt{2}}, \frac{\sqrt{c_{1}^2\rho _{1}^2 + c_{2}^2\rho _{2}^2}}{2 \rho _{1}} \right) \quad \text{ and } \\ \Vert \varSigma ^{(2)}L^T \Vert _2&= \max \left( \frac{c_{2}}{2\sqrt{2}}, \frac{\sqrt{c_{1}^2\rho _{1}^2 + c_{2}^2\rho _{2}^2}}{2 \rho _{2}} \right) . \end{aligned}$$

    The presence of the density ratios also causes linear growth in \(h\Vert M_h\Vert _2\). Hence, we cannot rule out that the spectral radius scales suboptimally as a function of the model parameters. In fact, the numerical experiment presented in Fig. 2 shows unwanted growth of the spectral radius in \(\rho _{1}/\rho _{2}\).

  6. Since we have failed to prove that this coupling is non-stiff, we will redesign it using the projection matrix formula.
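The failure in step 4 can be reproduced numerically; a small sketch with illustrative parameter values (\(c_1 = c_2 = 1\)), assuming the naive choice corresponds to \(k_i = l_i = 1/2\), which reproduces the stated bound:

```python
import numpy as np

# Reproduces the step-4 failure of the naive coupling numerically.
c1 = c2 = 1.0

def frob_naive_side1(rho1, rho2, k1=0.5, l1=0.5):
    # ||Sigma^(1) L^T||_F from the Frobenius norm formula in step 3
    return np.sqrt(c1**2 * rho1**2 * (2 * k1**2 + l1**2)
                   + c2**2 * rho2**2 * l1**2) / rho1

ratios = [1.0, 1e2, 1e4]                    # density ratios rho_2 / rho_1
norms = [frob_naive_side1(1.0, r) for r in ratios]

# Each 100x increase of the density ratio grows the norm by roughly 100x:
# the penalty matrix norm, and hence h||M_h||, is unbounded in rho_2/rho_1.
assert norms[2] / norms[1] > 50.0
```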

5.4 Provably non-stiff penalty treatment

As is shown in the following proposition, the projection matrix formula yields a useful estimate for well-posed linear hyperbolic problems. For certain choices of the penalty parameters \(\delta R\), the method is provably non-stiff.

Proposition 9

Let \(\varSigma L^T = AP\), where A is given in (19) and P is computed using the projection matrix formula (35). Then

$$\begin{aligned} \Vert \varSigma L^T\Vert _2 \le \frac{(1 + \Vert R\Vert _2)(1 + \Vert \delta R\Vert _2)}{\Vert I^- - R\delta R \Vert _2} \varrho (A). \end{aligned}$$
(56)

Proof

See Appendix C for proof. \(\square \)

If the denominator \(\Vert I^- - R\delta R\Vert _2\) approaches zero, \(h\Vert M_h\Vert _2\) may scale suboptimally. There are at least two choices of \(\delta R\) that prevent the denominator from vanishing for all admissible R. For the energy dissipative choice \(\delta R = 0\), (56) yields

$$\begin{aligned} \Vert \varSigma L^T\Vert _2 \le (1 + \Vert R\Vert _2)\varrho (A) \le 2\varrho (A). \end{aligned}$$

For the energy conservative choice \(\delta R = -R^T\), (56) yields

$$\begin{aligned} \Vert \varSigma L^T\Vert _2 \le \frac{(1 + \Vert R\Vert _2)^2}{\Vert I^- + RR^T\Vert _2}\varrho (A) \le 4\varrho (A). \end{aligned}$$

In each case, we used \(\Vert R\Vert _2 \le 1\), which is required by well-posedness.
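Both bounds can be spot-checked numerically; a minimal sketch with a random R scaled so that \(\Vert R\Vert _2 = 1\) (the well-posedness limit), with illustrative matrix sizes:

```python
import numpy as np

# Spot-check of the two non-stiff choices of delta R in Proposition 9.
rng = np.random.default_rng(2)
R = rng.standard_normal((3, 3))
R /= np.linalg.norm(R, 2)            # enforce ||R||_2 = 1
I = np.eye(3)

def bound(dR):
    # Right-hand-side factor of (56), without the rho(A) scaling
    return ((1 + np.linalg.norm(R, 2)) * (1 + np.linalg.norm(dR, 2))
            / np.linalg.norm(I - R @ dR, 2))

# Energy dissipative choice dR = 0: bound reduces to 1 + ||R||_2 <= 2.
assert bound(np.zeros((3, 3))) <= 2.0 + 1e-9
# Energy conservative choice dR = -R^T: bound <= 4 since ||I + R R^T||_2 >= 1.
assert bound(-R.T) <= 4.0 + 1e-9
```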

6 Revisiting the motivating example

To demonstrate how to derive dual consistent and non-stiff penalty terms, we apply the procedure developed in the previous sections to the motivating example presented in Sect. 2.

We first summarize the procedure and then carry out each step in turn.

  1. (The penalty formulation) Formulate the penalty terms such that the energy rate can be written similarly to (21). Since we are studying the coupled problem (1), we need to scale the energy rate as done in (9),

    $$\begin{aligned} \alpha \frac{d}{dt}\Vert q^{(1)}\Vert ^2 + \beta \frac{d}{dt}\Vert q^{(2)}\Vert ^2 = -\varPhi (u, u) + 2\varPhi (u, \delta u), \ \varPhi (u, v) = \oint _{\varGamma } u^T A v ds, \end{aligned}$$

    where \(u = (q^{(1)}, q^{(2)})\), \(A = \text{ diag }(\alpha \tilde{A}^{(1)}, \beta \tilde{A}^{(2)})\), and \(\delta u\) is to be determined.

  2. (Diagonalization) Diagonalize A and construct \(w^+\) and \(w^-\) using (23).

  3. (Coupling conditions) Rewrite the coupling conditions (4) in the form of \(w^- = Rw^+\) subject to the constraint \(R^TR \le I^+\) (see Proposition 3).

  4. (Penalty parameters) Choose some \(\delta R\) subject to the constraint \((\delta R)^T (\delta R) \le I^-\) (see Proposition 3).

  5. (Projection matrix formula) Determine \(\delta u = Pu\) using the projection matrix formula defined in Proposition 4.

  6. (Matrix norms) Following Sect. 5, compute the matrix norms \(\Vert AP / \alpha \Vert _F\) and \(\Vert AP / \beta \Vert _F\) to verify that the formulation is provably non-stiff.

1. We formulate the penalty terms for the respective sides of the interface as

$$\begin{aligned} f^{(1)}= \mathcal {L}\left( \frac{1}{\alpha } \begin{bmatrix} I&0 \end{bmatrix} A \delta u \right) \quad \text{ and } \quad f^{(2)}= \mathcal {L}\left( \frac{1}{\beta } \begin{bmatrix} 0&I \end{bmatrix} A \delta u \right) . \end{aligned}$$
(57)

This formulation of the penalty terms leads to

$$\begin{aligned} \int _{\varOmega ^{(1)}} \alpha (q^{(1)})^Tf^{(1)}d\varOmega + \int _{\varOmega ^{(2)}} \beta (q^{(2)})^Tf^{(2)}d\varOmega = \int _{\varGamma } u^T A \delta u ds. \end{aligned}$$

We use \(\alpha = \rho _{1}\) and \(\beta = \rho _{2}\), as these values are necessary for obtaining an energy balance when imposing the coupling conditions (4).

2. Next, following Sect. 3.2, we diagonalize A to obtain

$$\begin{aligned} \int _{\varGamma } u^T A u ds = \int _{\varGamma } (w^+)^T w^+ - (w^-)^T(w^-) ds. \end{aligned}$$
(58)

In (58), \(w^+ = (w^{+(1)}, w^{+(2)})\), \(w^- = (w^{-(1)}, w^{-(2)})\), where

$$\begin{aligned} w^{\pm (1)} = \frac{\sqrt{Z_{1}}}{\sqrt{2}}\left( \frac{p^{(1)}}{Z_{1}} \pm v_n^{(1)}\right) , \quad w^{\pm (2)} = \frac{\sqrt{Z_{2}}}{\sqrt{2}}\left( \frac{p^{(2)}}{Z_{2}} \mp v_n^{(2)}\right) , \end{aligned}$$

\(Z_{1}= \rho _{1}c_{1}\), and \(Z_{2}= \rho _{2}c_{2}\).

3. The coupling conditions (4) need to be converted into the form \(w^- = Rw^+\). After some algebra, one obtains the solution

$$\begin{aligned} R(Z_{1},Z_{2}) = \frac{1}{Z_{1}+ Z_{2}} \begin{bmatrix} Z_{2}- Z_{1}&{} 2\sqrt{Z_{1}Z_{2}} \\ 2\sqrt{Z_{1}Z_{2}} &{} Z_{1}- Z_{2}\end{bmatrix}. \end{aligned}$$
(59)

We verify that the matrix (59) satisfies (29). A direct calculation yields \(R^TR = I_2\), where \(I_2\) is the \(2 \times 2\) identity matrix. Hence, R is an orthogonal matrix.
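Orthogonality of (59) is straightforward to confirm numerically for any positive impedances; a minimal sketch (the impedance values below are illustrative):

```python
import numpy as np

def R_interface(Z1, Z2):
    # The coupling matrix (59); Z1, Z2 are the acoustic impedances rho_i * c_i.
    s = np.sqrt(Z1 * Z2)
    return np.array([[Z2 - Z1, 2 * s],
                     [2 * s, Z1 - Z2]]) / (Z1 + Z2)

# R^T R = I_2 holds for matched, water-air-like, and inverted contrasts alike.
for Z1, Z2 in [(1.0, 1.0), (1.0, 3200.0), (3200.0, 1.0)]:
    R = R_interface(Z1, Z2)
    assert np.allclose(R.T @ R, np.eye(2))
```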

4. Next, we choose \(\delta R = -R^T\). This choice results in energy conservation by (29) and (31), since \(R^TR = I_2\) and therefore \(\delta R (\delta R)^T = I_2\).

5. We use the projection matrix formula in Proposition 4 for \(\delta R = -R^T\), to obtain the penalty vector

$$\begin{aligned} \delta u = Pu = \frac{1}{Z_{1}+ Z_{2}} \begin{bmatrix} p^{(1)}- p^{(2)}\\ Z_{2}n_x\left( v^{(1)}_n - v^{(2)}_n\right) \\ Z_{2}n_y\left( v^{(1)}_n - v^{(2)}_n\right) \\ p^{(2)}- p^{(1)}\\ Z_{1}n_x\left( v^{(2)}_n - v^{(1)}_n\right) \\ Z_{1}n_y\left( v^{(2)}_n - v^{(1)}_n\right) \\ \end{bmatrix}. \end{aligned}$$
(60)

After inserting (60) into (57), the final penalty terms become

$$\begin{aligned} f^{(i)}_{ec} = \mathcal {L}\left( \frac{1}{\rho _{i}}APu\right) = \mathcal {L}\left( \frac{1}{\rho _{i}(Z_{1}+ Z_{2})} I^{(i)} \begin{bmatrix} Z_{1}Z_{2}\left( v^{(1)}_n - v^{(2)}_n\right) \\ Z_{1}n_x\left( p^{(1)}- p^{(2)}\right) \\ Z_{1}n_y\left( p^{(1)}- p^{(2)}\right) \\ Z_{1}Z_{2}\left( v^{(1)}_n - v^{(2)}_n\right) \\ Z_{2}n_x\left( p^{(1)}- p^{(2)}\right) \\ Z_{2}n_y\left( p^{(1)}- p^{(2)}\right) \\ \end{bmatrix} \right) , \end{aligned}$$
(61)

where \(I^{(1)} = [I \ 0], \ I^{(2)} = [0 \ I]\).

6. Following the steps laid out in the diagnostic test in Sect. 5, we arrive at

$$\begin{aligned} \left\| \frac{1}{\rho _{1}} AP \right\| _F&\le \frac{c_{max}}{2} \sqrt{\frac{\rho _{1}^2 + 3\rho _{2}^2}{(\rho _{1}+ \rho _{2})^2}} \le \frac{\sqrt{3}}{2}c_{max}, \\ \left\| \frac{1}{\rho _{2}} AP \right\| _F&\le \frac{c_{max}}{2} \sqrt{\frac{3\rho _{1}^2 + \rho _{2}^2}{(\rho _{1}+ \rho _{2})^2}} \le \frac{\sqrt{3}}{2}c_{max}, \end{aligned}$$

since \(\rho _{1}> 0, \rho _{2}> 0\). Hence, this formulation is provably non-stiff in the sense of Definition 2.

Alternatively, if we repeat steps 1–5 for the energy dissipative choice \(\delta R = 0\), the penalty terms become

$$\begin{aligned} f^{(i)}_{ed} = \mathcal {L}\left( \frac{1}{\rho _{i}(Z_{1}+ Z_{2})} I^{(i)} \begin{bmatrix} Z_{1}\left( Z_{2}(v^{(1)}_n - v^{(2)}_n) - \left( p^{(1)}- p^{(2)}\right) \right) \\ Z_{1}n_x \left( -Z_{2}(v^{(1)}_n - v^{(2)}_n) + \left( p^{(1)}- p^{(2)}\right) \right) \\ Z_{1}n_y \left( -Z_{2}(v^{(1)}_n - v^{(2)}_n) + \left( p^{(1)}- p^{(2)}\right) \right) \\ Z_{2}\left( Z_{1}(v^{(2)}_n - v^{(1)}_n) + \left( p^{(2)}- p^{(1)}\right) \right) \\ Z_{2}n_x \left( Z_{1}(v^{(2)}_n - v^{(1)}_n) + \left( p^{(2)}-p^{(1)}\right) \right) \\ Z_{2}n_y \left( Z_{1}(v^{(2)}_n - v^{(1)}_n) + \left( p^{(2)}- p^{(1)}\right) \right) \end{bmatrix} \right) . \end{aligned}$$
(62)

6.1 General penalty term formulation

By inspecting (61)–(62) we can formulate the following general penalty terms.

$$\begin{aligned} f^{(i)}&= \mathcal {L}\left( \frac{1}{\rho _{i}} I^{(i)} \begin{bmatrix} Z_{1}k_1 \left( v^{(1)}_n - v^{(2)}_n\right) \\ n_x k_2 \left( p^{(1)}- p^{(2)}\right) \\ n_y k_2 \left( p^{(1)}- p^{(2)}\right) \\ Z_{2}k_2 \left( v^{(1)}_n - v^{(2)}_n\right) \\ n_x k_1 \left( p^{(1)}- p^{(2)}\right) \\ n_y k_1 \left( p^{(1)}- p^{(2)}\right) \\ \end{bmatrix} + \frac{\zeta }{\rho _{i}} I^{(i)} \begin{bmatrix} - k_2 \left( p^{(1)}- p^{(2)}\right) \\ - n_x k_1 Z_{1}\left( v^{(1)}_n - v^{(2)}_n \right) \\ - n_y k_1 Z_{1}\left( v^{(1)}_n - v^{(2)}_n \right) \\ k_1\left( p^{(2)}- p^{(1)}\right) \\ n_x k_2 Z_{2}\left( v^{(2)}_n - v^{(1)}_n \right) \\ n_y k_2 Z_{2}\left( v^{(2)}_n - v^{(1)}_n \right) \end{bmatrix} \right) , \nonumber \\ \zeta&\ge 0. \end{aligned}$$
(63)

In (63), \(\zeta \) controls the amount of energy dissipation, and the parameters \(k_1, k_2\) listed in Proposition 1 are here defined as

$$\begin{aligned} k_1 = \frac{Z_{2}^m}{Z_{1}^m + Z_{2}^m} \quad \text{ and } \quad k_2 = \frac{Z_{1}^m}{Z_{1}^m + Z_{2}^m}, \end{aligned}$$
(64)

where m is a free parameter. This formulation recovers the previous formulations as special cases.

  • The naive choice (15) corresponds to

    $$\begin{aligned} \quad m = 0 \quad \text{ and } \quad \zeta = 0. \end{aligned}$$
    (65)
  • The energy conservative choice, \(\delta R = -R^T\), corresponds to

    $$\begin{aligned} m = 1 \quad \text{ and } \quad \zeta = 0. \end{aligned}$$
    (66)
  • The energy dissipative choice, \(\delta R = 0\), corresponds to

    $$\begin{aligned} m = 1 \quad \text{ and } \quad \zeta = 1. \end{aligned}$$
    (67)
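The parameterization (64) and the special cases above can be sanity-checked directly; a minimal sketch (the impedance values are illustrative):

```python
# Sanity check of the weights (64) and the special cases (65)-(66).
def k_params(Z1, Z2, m):
    # k1 = Z2^m / (Z1^m + Z2^m), k2 = Z1^m / (Z1^m + Z2^m)
    return Z2**m / (Z1**m + Z2**m), Z1**m / (Z1**m + Z2**m)

Z1, Z2 = 1.0, 3200.0
assert k_params(Z1, Z2, 0) == (0.5, 0.5)          # m = 0: the naive choice
k1, k2 = k_params(Z1, Z2, 1)                      # m = 1: impedance weighting
assert abs(k1 - Z2 / (Z1 + Z2)) < 1e-15
assert abs(k2 - Z1 / (Z1 + Z2)) < 1e-15
assert abs(k1 + k2 - 1.0) < 1e-15                 # the weights sum to one
```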

To illustrate the results, we revisit the spectral radius investigation presented in Sect. 2.4. The energy conservative and dissipative choices (66)–(67) each cause the spectral radius to be independent of the density ratio, as shown in Fig. 3.

Fig. 3
figure 3

The normalized spectral radius \(h \varrho (M_h) / c\) as a function of the density ratio \(\rho _{1}/ \rho _{2}\) (on a log-log scale)

7 Numerical experiments

In this section, we conduct several numerical experiments. These experiments demonstrate the importance of having a scheme with bounded matrix norm for robustness, and show the predictive ability of Propositions 8–9. In addition, we show that linear functionals superconverge. Finally, we investigate the advantages and disadvantages of developing a near-energy conservative scheme (in both space and time) for the wave equation in two spatial dimensions.

7.1 Matrix norm behavior

Consider the wave equation in a single domain on the unit interval \(0 \le x \le 1\), subject to the boundary conditions

$$\begin{aligned}&w^-(0,t) = 0, \end{aligned}$$
(68)
$$\begin{aligned}&p(1,t) = \alpha Z v(1,t), \quad \alpha > 0. \end{aligned}$$
(69)

The left boundary condition (68) is implemented by taking \(R = 0\) and \(\delta R = 0\) in the projection matrix formula (Proposition 4). By Proposition 3, the problem is well-posed and energy dissipative. We consider two different well-posed implementations of the right boundary condition, described next.

7.1.1 Naive choice

The right boundary condition can be implemented naively by applying the penalty term

$$\begin{aligned} f_{x=1} = \mathcal {L}\left( \varSigma (p - \alpha Z v) \right) , \end{aligned}$$

where \(\int u^T\mathcal {L}(v) dx = u(1,t)v(1,t)\). The penalty matrix is chosen as \(\varSigma = [ 0, \ 1/\rho ]^T\), which results in the energy dissipation rate

$$\begin{aligned} \rho \frac{d\Vert q\Vert ^2}{dt} = -\alpha Z v^2(1,t) \le 0. \end{aligned}$$

(The energy rate contribution from the left boundary has been neglected.) We write the penalty term as \(f_{x=1} = \mathcal {L}(\varSigma L^T q)\), where

$$\begin{aligned} \varSigma L^T = \begin{bmatrix} 0 &{} 0 \\ c &{} -\alpha c \end{bmatrix}. \end{aligned}$$

By Proposition 8, this penalty term causes suboptimal scaling, since

$$\begin{aligned} \Vert \varSigma L^T\Vert _F = c\sqrt{1 + \alpha ^2}. \end{aligned}$$

Hence, for \(\alpha \gg 1\) the growth rate of \(h\Vert M_h\Vert \) is linear in \(\alpha \). We confirm this prediction by numerically computing \(\Vert M_h\Vert _2\) as well as lower and upper bounds on it by applying Proposition 8. We discretize in space by applying the fourth order SBP operators [28] using 100 grid points. The constants in Proposition 8 are numerically computed to be:

$$\begin{aligned} s_d = h\Vert D_x\Vert _2 \approx 2.359, \quad r_l = s_l = h\Vert \mathcal {L}_h\Vert _2 \approx 2.824. \end{aligned}$$

Figure 4 shows that the norm \(h\Vert M_h\Vert _2\) grows linearly in \(\alpha \), as predicted.

Fig. 4
figure 4

Behavior of \(\Vert M_h\Vert _2\) as a function of the boundary parameter \(\alpha \) in (69). Lower bounds (LB) and upper bounds (UB) are given by Proposition 8

7.1.2 Projection matrix formula

The right boundary condition (69) can also be implemented by applying the projection matrix formula (Proposition 4). In this case, \(R = (\alpha - 1) / ( 1 + \alpha )\), whereas \(\delta R\) is a free but bounded parameter. We investigate how \(\delta R\) influences \(h\Vert M_h\Vert _2\) for a fixed \(\alpha = 10\) by varying \(\delta R\) within its stability limit \(-1 \le \delta R \le 1\). Figure 5 shows that any \( \delta R \le 0\) has negligible impact on \(h\Vert M_h\Vert _2\). On the other hand, when \(\delta R \rightarrow 1\), \(h\Vert M_h\Vert _2\) grows rapidly. Proposition 9 explains this growth, since \(\Vert I^- - R\delta R\Vert \ll 1\) when \(\delta R \rightarrow 1\).

Fig. 5
figure 5

Behavior of \(\Vert M_h\Vert _2\) as a function of the penalty parameter \(\delta R\). Lower bounds (LB) and upper bounds (UB) are given by Proposition 8

7.2 Superconvergence and dual consistency

SBP-SAT schemes that are dual consistent exhibit superconvergence for linear functionals. As explained in Sect. 4, a necessary condition for dual consistency is that the weak primal boundary conditions are related to the weak dual boundary conditions in a certain way. Interestingly, Proposition 7 shows that all penalty terms constructed using the projection matrix formula satisfy this condition. We numerically verify this result by constructing both energy conservative and energy dissipative penalty treatments that result in superconvergent linear functionals.

Consider again the wave equation in a single domain \(0 \le x \le 1\). We use the method of manufactured solutions to construct the solution

$$\begin{aligned} p(x,t) = \cos (kt) \sin (kx), \quad v(x,t) = - \sin (kt)\cos (kx). \end{aligned}$$
(70)

The manufactured solution (70) satisfies the wave equation subject to the initial and boundary conditions:

$$\begin{aligned} p(x,0)&= \sin (k x), \quad&v(x,0)&= 0, \\ p(0,t)&= 0, \quad&v(1,t)&= -\sin (kt)\cos (k). \nonumber \end{aligned}$$
(71)

These boundary conditions correspond to \(R = -1\) at both \(x=0\) and \(x=1\). The manufactured solution (70) results, for example, in the linear functionals

$$\begin{aligned} F(p) = \int ^1_0 p dx = \frac{\cos (k) - 1}{k}\cos (kt), \quad F(v) = \int ^1_0 v dx = -\frac{\sin (k)}{k}\sin (kt). \end{aligned}$$
(72)

As before, the boundary conditions are implemented using the projection matrix formula. We compare the energy dissipative choice \(\delta R = 0\) against the energy conservative choice \(\delta R = -R^T = 1\). Apart from this parameter choice, all other settings are the same. In (70), we use \(k = 8 \pi \). We advance in time using a fourth order Runge–Kutta method until the final time \(T = 1.2\) using the timestep \(\varDelta t = h/4\) (the wave speed is set to \(c=1\) in this case). This experiment uses fourth order SBP operators, and the expected convergence rate is third order in the variables p and v according to [29, 31]. The linear functionals (72) should superconverge at a fourth order rate [14]. Tables 1 and 2 list the errors and convergence rates for the energy dissipative and energy conservative penalty parameter choices, respectively. As expected, both choices result in superconvergent functionals.

Table 1 Errors and convergence rates when using an energy dissipative penalty parameter value \(\delta R = 0\)
Table 2 Errors and convergence rates when using the energy conservative penalty parameter value \(\delta R = 1\)

7.3 Energy conservation versus energy dissipation

For energy conservative problems, it may be beneficial to design a numerical scheme that conserves energy in a semi-discrete sense. Weak boundary conditions that add dissipation (\(\zeta > 0\) in (63)) cause the spectrum of the spatial discretization to have eigenvalues with negative real parts. The amount of dissipation controls the distance of these eigenvalues from the imaginary axis. Since some Runge–Kutta methods have a stability region that is wider along the imaginary axis than along the real axis, the presence of dissipation can have a negative impact on the maximum stable time step. Staggered time-stepping methods, such as leap-frog and the fourth order staggered Runge–Kutta method [11], have stability regions that are primarily confined to the imaginary axis. In return, they have smaller truncation errors and larger stability regions along the imaginary axis than their non-staggered counterparts. For instance, the fourth order staggered Runge–Kutta method has about a factor of two larger stability region along the imaginary axis, and a factor of 16 smaller truncation error, compared to the classical fourth order Runge–Kutta scheme. On the other hand, allowing for dissipation may improve the accuracy of a numerical scheme. To understand how these benefits and drawbacks influence performance, we compare the computational efficiency of energy conserving discretizations advanced in time with the fourth order staggered Runge–Kutta (SRK4) scheme against energy dissipating discretizations advanced in time using the fourth order, five-stage (5-4 solution 3) 2N low-storage Runge–Kutta method (RK4) [15].

To investigate the performance and demonstrate the importance of how the coupling parameters are chosen, we present a challenging application problem featuring a light-dense media interface. The interface is located at \(y = 0\) and the light medium rests on top of the dense medium. The material properties in the light medium are \(\rho _{1}= 1\), \(c_{1}= 1\), and they are \(\rho _{2}= 800\), \(c_{2}= 4\) in the dense medium. These parameter values result in an impedance contrast of \(Z_{2}/Z_{1}= 3200\), which is representative of a water-air interface. This large impedance contrast makes the problem computationally challenging. Figure 6 shows a coarse grid simulation of the pressure field for a fixed time. An explosive source initiated in the air medium sends out waves that both reflect against and transmit across the air-water interface (black horizontal line).

Fig. 6
figure 6

Pressure wave field at t = 0.4, computed on a coarse grid (\(\lambda _{min}/h \approx 6\)) with the energy dissipative coupling (67)

The air and water media are each discretized by the fourth order SBP staggered finite difference operators presented in [23]. Initially, all fields are set to zero. To initiate the simulation, we use a singular source term with a prescribed source time function. This source term is written as \(\delta (x-x_s)\delta (y-y_s)g(t)\) and acts on the right-hand side of the pressure equation. The source is positioned in the close vicinity of the interface, at \((x_s, y_s) = (0.0, 0.05)\). The delta functions \(\delta (x-x_s)\) and \(\delta (y-y_s)\) are discretized in a line-by-line manner to fourth order accuracy by imposing moment conditions; see [25] for details. As the source time function, g(t), we use the Ricker wavelet

$$\begin{aligned} g(t) = \left( 1 - 2\pi ^2f_p^2(t - t_0)^2\right) e^{-\pi ^2f_p^2(t-t_0)^2}, \end{aligned}$$

where \(f_p = 6.4\) and \(t_0 = 0.25\). These settings result in \(\lambda _{min}/h \approx 6\) grid points per minimum wavelength in the air medium, where \(\lambda _{min} = c_{min}/f_{max}\) and \(f_{max}\) is the frequency at 5% of the peak amplitude of the Ricker wavelet in the Fourier domain.
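The Ricker wavelet is simple to implement; a minimal sketch, with defaults following the parameter values stated above:

```python
import numpy as np

def ricker(t, fp=6.4, t0=0.25):
    # Ricker wavelet g(t) = (1 - 2 pi^2 fp^2 (t - t0)^2) exp(-pi^2 fp^2 (t - t0)^2)
    a = (np.pi * fp * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

# The wavelet peaks at t = t0 with amplitude 1 and decays rapidly away from it.
assert np.isclose(ricker(0.25), 1.0)
assert abs(ricker(0.5)) < 1.0
```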

Since we are mostly interested in understanding how the coupling treatment influences accuracy at the interface, we measure the error along it. Because the coupling conditions are imposed weakly, the numerical solution is not exactly continuous across the interface; we therefore define the error as the jump in a quantity that is continuous in the exact solution. We choose the jump in pressure. To approximate the solution anywhere along the interface, we use cubic Lagrange interpolation. In particular, we focus on measuring the error in the middle of the interface, at \((x_r, y_r) = (0.0, 0.0)\).
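The interpolation step can be sketched as follows; the node layout near the interface is our assumption, since the text does not specify which grid points are used.

```python
def lagrange_cubic(x_nodes, f_nodes, x):
    """Evaluate the cubic Lagrange interpolant through four distinct
    nodes (x_nodes, f_nodes) at the point x."""
    assert len(x_nodes) == 4 and len(f_nodes) == 4
    result = 0.0
    for j in range(4):
        basis = 1.0  # Lagrange basis polynomial l_j(x)
        for m in range(4):
            if m != j:
                basis *= (x - x_nodes[m]) / (x_nodes[j] - x_nodes[m])
        result += f_nodes[j] * basis
    return result
```

A four-point Lagrange interpolant reproduces cubic polynomials exactly and is fourth order accurate on smooth data, so the interpolation error does not dominate the fourth order spatial discretization.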

7.3.1 Accuracy

Before investigating the computational efficiency of each method, we compare the accuracy of the energy conservative and energy dissipative couplings obtained by the projection matrix formula (35). When solving the air-water interface problem on the coarse grid with the energy conservative coupling (66), numerical artifacts emerge: large-amplitude spurious oscillations that propagate along the interface. Figure 7a shows these oscillations at the same time and on the same grid as in Fig. 6. While the spurious oscillations vanish with grid refinement and appear to be confined to the interface, they prevent one from performing coarse grid computations that demand a reasonably accurate (less than 10% error) solution at the interface. Fortunately, we can suppress these spurious oscillations by selecting a different \(\delta R\) in the projection matrix formula.

Fig. 7 Qualitative accuracy comparison of energy conserving couplings. The energy conservative coupling (66) causes noticeable spurious oscillations on coarse grids

To understand what causes the spurious oscillations to develop, recall that the interface problem couples a light medium to a dense medium, giving a large impedance contrast \(Z_{2}/Z_{1}\gg 1\) across the interface. In this limit, we study the reflection and transmission of plane wave solutions that are normally incident on the interface. The following solution satisfies (1),

$$\begin{aligned} p^{(1)}&=-Z_{1}e^{i\tilde{k}_1(y + c_{1}t)} + a_RZ_{1}e^{i\tilde{k}_1(-y + c_{1}t)}, \\ v_x^{(1)}&= 0, \ v_y^{(1)} =e^{i\tilde{k}_1(y + c_1 t)} + a_Re^{i\tilde{k}_1(-y + c_1 t)} \\ p^{(2)}&=-Z_{2}a_T e^{i\tilde{k}_2(y + c_{2}t)},\ v^{(2)}_x = 0,\ v^{(2)}_y = a_Te^{i\tilde{k}_2(y + c_2 t)}, \end{aligned}$$

where \(\tilde{k}_1\), \(\tilde{k}_2\) are wavenumbers that satisfy the dispersion relation \(\omega = \tilde{k}_1 c_1 = \tilde{k}_2 c_2\), and \(a_R\), \(a_T\) are reflection and transmission coefficients, determined by the coupling conditions (4). The reflection and transmission coefficients are:

$$\begin{aligned} a_R = \frac{Z_{1}- Z_{2}}{Z_{1}+ Z_{2}} \quad \text{ and } \quad a_T = \frac{2Z_{1}}{Z_{1}+ Z_{2}}. \end{aligned}$$
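A quick numerical check of these coefficients for the impedance values used in this problem (a sketch; \(Z = \rho c\) is the acoustic impedance):

```python
def reflection_transmission(Z1, Z2):
    """Normal-incidence reflection and transmission coefficients for a
    wave traveling in medium 1 toward the interface with medium 2."""
    a_R = (Z1 - Z2) / (Z1 + Z2)
    a_T = 2.0 * Z1 / (Z1 + Z2)
    return a_R, a_T

Z1 = 1.0 * 1.0    # rho_1 * c_1 (light medium)
Z2 = 800.0 * 4.0  # rho_2 * c_2 (dense medium), so Z2/Z1 = 3200
a_R, a_T = reflection_transmission(Z1, Z2)
# With Z2/Z1 = 3200, a_R lies within 0.1% of -1 and a_T within 0.1% of 0:
# the light medium already sees the interface as nearly a rigid wall.
```

Note also the identity \(1 + a_R = a_T\), which follows directly from the two expressions.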

In the limit when \(Z_{2}/Z_{1}\rightarrow \infty \), we get \(a_R = -1\), \(a_T = 0\), and there is no transmission into the dense medium. An interpretation of this limit is that a wave propagating in the light medium senses the interface as a rigid wall. On the other hand, a wave propagating in the dense medium senses the interface as a free surface, and transmits into the light medium with double the amplitude. If we neglect any transmission, the problem decouples into two problems with the boundary conditions

$$\begin{aligned} v^{(1)}_n = 0\quad (\text{ light}), \quad p^{(2)}= 0\quad (\text{ dense}), \quad (x,y) \in \varGamma . \end{aligned}$$

The only way to implement these boundary conditions in an energy conserving manner is to set

$$\begin{aligned} m \rightarrow \infty \quad \text{ and } \quad \zeta = 0, \end{aligned}$$
(73)

in (63)–(64). The choice (73) results in \(k_1 = 1\), \(k_2 = 0\), and (63) yields

$$\begin{aligned} f^{(1)} = \mathcal {L}\left( \frac{1}{\rho _{1}} \begin{bmatrix} Z_{1}\left( v^{(1)}_n - v^{(2)}_n\right) \\ 0 \\ 0 \\ \end{bmatrix} \right) , \quad f^{(2)} = \mathcal {L}\left( \frac{1}{\rho _{2}} \begin{bmatrix} 0 \\ n_x \left( p^{(1)}- p^{(2)}\right) \\ n_y \left( p^{(1)}- p^{(2)}\right) \\ \end{bmatrix} \right) , \end{aligned}$$
(74)

where \(v^{(2)}_n=p^{(1)}=0\) in the decoupled case.

In the limit when \(Z_{2}/Z_{1}\rightarrow \infty \), the energy conservative coupling (66) approaches the penalty terms of (74) at a linear rate. We speculate that this rate is too slow when there is no artificial dissipation present to suppress the spurious oscillations. Instead, we choose \(\delta R\) so that it corresponds to (74) by directly taking the limit (we do not modify \(Z_{2}, Z_{1}\) in the problem itself)

$$\begin{aligned} \delta R = \lim _{Z_{2}/Z_{1}\rightarrow \infty } -R^T(Z_{1},Z_{2}) = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}. \end{aligned}$$
(75)

Obviously, if \(Z_{1}> Z_{2}\), we would instead have taken the opposite limit \(Z_{2}/Z_{1}\rightarrow 0\). Note that the choice (75) preserves the energy conservation property because \(\delta R\) is still an orthogonal matrix. Since we used the projection matrix formula to derive this coupling, it is also both dual consistent and provably non-stiff.
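The energy conservation property rests on \(\delta R\) being orthogonal; a minimal NumPy check that the limit matrix (75) indeed preserves the Euclidean norm:

```python
import numpy as np

dR = np.array([[-1.0, 0.0],
               [0.0,  1.0]])  # the limit matrix (75)

# Orthogonality: dR^T dR = I, hence |dR w| = |w| for every vector w,
# which is what the energy argument requires.
assert np.allclose(dR.T @ dR, np.eye(2))
w = np.array([3.0, -4.0])
assert np.isclose(np.linalg.norm(dR @ w), np.linalg.norm(w))
```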

This simple modification of the penalty parameter results in a dramatic improvement in the accuracy of the coarse grid simulation (Fig. 7b). However, the dissipative choice \(\delta R = 0\) in Fig. 6 remains the most accurate option.

Figure 8 shows the error in pressure at the center grid point of the interface as a function of time for each type of coupling. The modified energy conservative coupling (73), SRK4-EC2, reduces the maximum amplitude of the error by a factor of 18 compared to the coupling (66), SRK4-EC1. While this simple modification resulted in a dramatic improvement for this particular problem, we have not investigated other problems with moderate impedance contrasts.

Fig. 8 Error in pressure in the middle of the interface. See Table 3 for label descriptions

7.3.2 Time step selection

We determine the maximum stable time step \(\varDelta t_{max} = \text{ CFL } h / c_{max}\) for each type of coupling and time stepping method. In each case, we maximize \(\text{ CFL }\) by performing several coarse grid computations and applying the bisection method. Table 3 lists the CFL numbers for each method to three decimal places.
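The bisection search for the stability limit can be sketched as follows; `is_stable` is a hypothetical callback that runs a coarse grid simulation at a trial CFL number and reports whether the solution stays bounded.

```python
def max_stable_cfl(is_stable, lo=0.0, hi=4.0, tol=1e-3):
    """Largest CFL number (to within tol) for which is_stable(cfl)
    holds, assuming stability is monotone in the CFL number."""
    assert is_stable(lo) and not is_stable(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_stable(mid):
            lo = mid   # stable: the limit is at or above mid
        else:
            hi = mid   # unstable: the limit is below mid
    return lo
```

Running the search to tol = 1e-3 matches the three-decimal precision reported in Table 3; the bracket endpoints here are illustrative.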

Table 3 CFL numbers for each type of time stepping method and coupling used in the investigation

As expected from the 1D investigations, the naive coupling, RK4-NA, results in more than an order of magnitude reduction in \(\nu \) compared to the other couplings. Interestingly, SRK4-EC1 and SRK4-EC2 allow a factor of four larger CFL compared to RK4-ED. We believe this improvement has two causes: (1) SRK4 has a factor of \(\approx 2\) larger stability region along the imaginary axis compared to RK4; (2) the dissipative part of (67) shifts eigenvalues off the imaginary axis, where the RK4 stability envelope is a factor of \(\approx 2\) smaller than along the imaginary axis. By decreasing \(\zeta \) in (63) it should be possible to increase the CFL number, but most likely at the expense of decreased accuracy in the solution.

7.3.3 Computational efficiency

With each method tuned to run at its time step stability limit, we compare their computational efficiency (error as a function of computational time). The error norm is the maximum amplitude in time of the error in the middle of the interface (see Fig. 8). Figure 9 shows the computational efficiency of each method. The naive method, RK4-NA, performs the worst, running more than an order of magnitude slower than the other methods. The computational efficiency of SRK4-EC1 is limited by the large errors caused by the inaccurate coupling. On coarser grids, SRK4-EC2 performs significantly faster than RK4-ED, but the trend reverses on sufficiently fine grids. RK4-ED is the most accurate method on all grids. While not shown, we also tested RK4-EC2, which resulted in an almost indistinguishable difference in error compared to SRK4-EC2. To model the speedup of SRK4-EC2 compared to RK4-ED, we apply a cubic polynomial least-squares fit to each data series. Using this model, we observe a maximum performance increase of about 150% in favor of SRK4-EC2 at the 10% error level. Due to the improved accuracy of the dissipative coupling, this performance increase rapidly diminishes; at an error level of 0.01% it is completely saturated. In conclusion, the energy conservative method outperforms the energy dissipative method on coarse grids due to better stability properties, while the energy dissipative method outperforms the energy conservative method on fine grids due to better accuracy properties (Fig. 10).
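The speedup model can be sketched as below. Fitting in log-log space is our assumption (error and runtime both span several orders of magnitude); the text only states that a cubic least-squares fit is applied to each data series.

```python
import numpy as np

def fit_runtime_model(errors, runtimes):
    """Cubic least-squares fit of log10(runtime) against log10(error);
    returns a callable runtime(error)."""
    c = np.polyfit(np.log10(errors), np.log10(runtimes), deg=3)
    return lambda err: 10.0 ** np.polyval(c, np.log10(err))

def speedup(model_ref, model_new, err):
    """Speedup of the new method over the reference at error level err."""
    return model_ref(err) / model_new(err)
```

At a fixed error level, a speedup of 2.5 corresponds to the reported 150% performance increase.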

Fig. 9 Computational efficiency of the SRK4 and RK4 schemes

Fig. 10 Modeled speedup of SRK4-EC2 compared to RK4-ED (speedup \( > 1\) means SRK4-EC2 is faster than RK4-ED)

To determine computational timings, we implemented each method in a similar manner using C++ and CUDA. The timing experiments were run on an Nvidia RTX 2080 Ti card. No efforts were made to optimize the performance of any of the implementations.

8 Conclusions

When weakly coupling hyperbolic partial differential equations, one must select certain penalty parameters that describe how the coupling conditions are weighted. Within the SBP community, it is well-established how to constrain these parameters to give a proof of semi-discrete stability of a numerical scheme via the energy method. However, the energy method alone cannot fully constrain all of the parameters. The remaining parameters must be carefully selected because they can have a striking impact on stiffness and accuracy of the numerical scheme.

We demonstrated this claim by simulating the interaction of waves across an air-water interface in an energy conserving manner. For one set of parameter values, the coupling treatment caused the numerical scheme to be an order of magnitude stiffer than expected from the CFL condition of the uncoupled case. Another set of parameter values prevented stiffness, but instead caused a degradation in accuracy, manifested as spurious waves developing in the vicinity of the interface. In this study, we explained what causes stiffness and developed a general coupling procedure that is provably non-stiff and accurate.

To overcome stiffness in an automated manner, we presented a general formulation for designing penalty terms. This general formulation results in stable and dual consistent schemes that are provably non-stiff. In this formulation, the penalty terms are related to projection matrices that are guaranteed to exist as long as a determinant condition is satisfied. In the limit when the determinant approaches zero, the matrix norm grows without bound. We gave two examples for which the determinant never vanishes; these examples are associated with either an energy conservative or an energy dissipative coupling.

A potential advantage of the energy conservative coupling compared to the energy dissipative coupling is that it is compatible with energy conservative time stepping schemes. For this reason, we compared the computational efficiency of a fourth order staggered Runge–Kutta scheme with the energy conservative coupling against a fourth order Runge–Kutta scheme with the energy dissipative coupling. We showed that energy conserving methods can outperform energy dissipative methods on coarse grids because they allow larger time steps. However, energy dissipative methods outperform energy conservative methods on fine grids because they have better accuracy properties. For the air-water interface problem, we observed a factor of 2.5 speedup of the energy conservative method compared to the energy dissipative method on a coarse grid.

There is one important aspect that we did not address: which mechanism of the penalty parameters controls accuracy? For the energy conservative penalty parameters of the air-water interface problem, we saw that a simple change in a parameter value resulted in an order of magnitude reduction in the error level on the coarsest grid. We linked this accuracy improvement to what the coupling treatment must become in the limit of infinite impedance contrast. Ideally, we would like to have a simple diagnostic test, analogous to the stiffness test, that can establish whether a set of parameters will lead to an accurate interface treatment. It would also be useful to have an automated and general procedure that selects these parameters, thereby avoiding the cumbersome work of manually deriving and testing parameters on a case-by-case basis.