An Explicit Mapped Tent Pitching Scheme for Maxwell Equations

We present a new numerical method for solving time dependent Maxwell equations, which is also suitable for general linear hyperbolic equations. It is based on an unstructured partitioning of the spacetime domain into tent-shaped regions that respect causality. Provided that an approximate solution is available at the tent bottom, the equation can be locally evolved up to the top of the tent. By mapping tents to a domain which is a tensor product of a spatial domain with a time interval, it is possible to construct a fully explicit scheme that advances the solution through unstructured meshes. This work highlights a difficulty that arises when standard explicit Runge Kutta schemes are used in this context and proposes an alternative structure-aware Taylor time-stepping technique. Thus explicit methods are constructed that allow variable time steps and local refinements without compromising high order accuracy in space and time. These Mapped Tent Pitching (MTP) schemes lead to highly parallel algorithms, which utilize modern computer architectures extremely well.


Introduction
Electromagnetic waves propagate at the speed of light. Thus, the field at a certain point in space and time depends only on field values within a dependency cone. A tent pitching method introduces a special "causal" spacetime mesh that respects this finite speed of propagation. It is not limited to Maxwell equations, but can be applied to general hyperbolic equations. A tent pitching method requires a numerical scheme to discretize the equation on that mesh. Discontinuous Galerkin (DG) methods are of particular interest since they offer a systematic avenue to build high order methods. For a given initial condition at the bottom of a tent, the discrete equations may be solved within each individual tent, up to the tent top. The computed solution at the tent top provides initial conditions for the tents that follow later in time. This method is highly parallel, since many tents can be solved independently. Methods using such tent-pitched meshes may be traced back to [5,7]. More recent works [1,6,8] develop Spacetime DG (SDG) methods within tents by formulating local variational problems, for which linear systems are set up and solved. Although these systems are local, the matrix size can grow rapidly with the polynomial order, especially in four-dimensional spacetime tents. In this context it is natural to ask if one can develop explicit schemes (which usually perform well under low memory bandwidth) that take advantage of tents.
A key ingredient to answer this question was presented in [2], where Mapped Tent Pitching (MTP) schemes were introduced. The MTP discretization, which proceeds by mapping tents to a spacetime cylinder, allows one to evolve the solution either implicitly or explicitly within tents. The memory requirements of the explicit MTP scheme are limited to what is needed for storing the spatial mesh, the solution coefficients at one time step, and the topology of the tents.
In this work, we show that notwithstanding the above-mentioned advantages of the explicit MTP scheme, one may lose higher order convergence if a naive time stepping strategy (involving a standard explicit Runge-Kutta scheme) is used. We then develop a new Taylor time-stepping for the local problems within tents. Despite its simplicity, our numerical experiments show that it delivers optimal order of convergence.

Mesh generation by tent pitching
We start with a conforming spatial mesh consisting of elements T = {T } and vertices V = {V }. We progress in time by defining a sequence of advancing fronts τ i . A front τ i is given as a standard nodal finite element function on this mesh. It is defined by storing the current time for every vertex of the mesh. We move from τ i to the next front τ i+1 by moving one vertex forward in time, while keeping all other vertices fixed. The spacetime domain between τ i and τ i+1 we call a tent. In Fig. 1, the red domain is the tent between τ i and τ i+1 .
Its projection to the spatial domain is exactly the vertex patch ω V around V of the original mesh. The data to be stored for one tent are the bottom and top-times of the central vertex, plus the times for all neighboring vertices.
Note that although the algorithm is described sequentially, it is highly parallel. Vertices with graph-distance of at least two can be moved forward independently. For example, in Fig. 1, all blue tents can be built and processed in parallel.
The distance for advancing a vertex is limited by the speed of light, a constraint often referred to in the literature as the causality condition. Under this condition, the Maxwell problem inside the tent is solvable using the initial conditions at the tent bottom. Thus, the top boundary is an outgoing boundary and no boundary conditions are needed there.
Note that the spatial mesh is refined towards the right boundary, which leads to smaller tent heights at the right boundary. Hence, smaller time steps in locally refined regions is a very natural feature of tent pitching methods.

The MTP discretization
Now, we consider the discretization method for one tent domain where ω V is the union of elements containing the vertex V , and ϕ b and ϕ t are the bottom and top fronts, respectively, restricted to ω V . Our aim is to numerically solve the Maxwell system on K, namely where boundary values for both fields are given at the tent bottom and ∇ = ∇ x denotes the spatial gradient. The approach of MTP schemes is to map the tent domain to a spacetime cylinder ω V × (0, 1) and solve the transformed equation there. The transformation from the cylinder to the tent is denoted by Φ : It is similar to the Duffy transformation mapping a square to a triangle. With the notation we can rephrase the curl operator as ∇ × E = div skew E, where the divergence of the matrix function is taken row-wise. To simplify notation further, we define u : K → R 6 by u = (E, H), and set g : K → R 6 and f : Then (1) may be rewritten as the conservation law ∂ t g(u) which allows us to write Maxwell's system (1) as the spacetime conservation law For each row of F, the spacetime divergence div x,t sums the spatial divergence of the first three components with the time-derivative of the last component. Now, we apply the Piola transformation to pull back F from the tent K to the cylinder using the mapping Φ. The derivative of Φ and its transposed inverse are Since the Piola transform provides an algebraic transformation of the divergence, equation (3) is simply transformed to div x,t F(û) = 0 on the spacetime cylinder. Then, inserting the Jacobian of Φ leads us to the transformed equation where is the local height of the tent. Note that ∇ϕ is an affinelinear function in quasi-timet. Equation (4) describes the evolution ofû along quasitime fromt = 0 tot = 1. Details of the calculations are given in [2]. The next step is the space discretization of (4) by a standard discontinuous Galerkin method. Let V h ⊂ [L 2 ] 6 be the DG finite element space of degree p on T . On each tent we search forû : holds for all v h ∈ V h and allt ∈ [0, 1]. Only the restriction of V h on the patch ω V is used in this equation. The numerical flux f n (û + ,û − ) depends on the positive trace lim s→0 +û(x + sn) and negative trace lim s→0 +û(x − sn), where n is a unit normal vector of arbitrary orientation to the face. The jump is defined as usual by û := u + −û − and the mean value by {û} := 1 2 (û + +û − ). One example is the upwind flux [3, p. 434] Note that the local tent height δ enters the boundary integrals as a multiplicative factor. At the outer boundary of the vertex patch we have δ = 0, so the facet integrals on the outer boundary disappear. For the above semidiscrete system, initial values for the tent problem are given finite element functions at the tent bottom. The finite element solution on the tent top provides the initial conditions for the next level tent. Therefore, no projection of initial values is needed when propagating from one tent to the next. After the semi-discretization, as usual, we are left to solve a system of N = dimV h (ω V ) ordinary differential equations for U : given U(0). The non-standard feature of (5) is that M is an affine-linear function of the quasi-timet (since our mapping enters the mass matrix M through ∇ϕ). The matrix A is independent oft. A straightforward approach is to substitute Y = MU and solve d dt instead of (5). Although first order convergence was observed with this strategy, further numerical studies showed reduced order of convergence if the stage-order of the Runge Kutta (RK) method is not high enough -see Fig. 3 (right). While the implicit MTP schemes discussed in [2] do not show this problem, the issue remains critical for explicit schemes. Thus, we propose to use a new type of explicit timestepping for time discretization, discussed next.

Structure-aware Taylor time-stepping
Returning to the ordinary differential equation (5) and continuing to make the substitution Y = MU, we now reconsider the previous equation as the following differential-algebraic system: We begin by subdividing the interval (0, 1) into m ∈ N smaller intervals of size 1 m , defined by (t i ,t i+1 ) = ( i m , i+1 m ), for i ∈ N and 0 ≤ i ≤ m − 1. Recall that A is independent of quasi-timet, and M is an affine function oft, i.e., where M i = M(t i ) and the derivative M is a constant matrix. We want to design a time-stepping scheme that is aware of this structure.
Consider the approximations to Y,U on (t i ,t i+1 ) in the form of Taylor polynomials Y i ,U i of degree q, defined by where Y i,n = Y  i (t i ). To find these derivatives, we differentiate both equations of (6) n times to get For the second equation we used Leibnitz' formula ( f g) (n) = ∑ n i=0 n i f (i) g (n−i) , and the fact that M is affine-linear. Evaluating these equations for the Taylor polynomials Y i ,U i att =t i , we obtain a recursive formula for Y i,n and U i,n in terms of U i,n−1 , namely for all 0 ≤ i ≤ m − 1. Given Y 0,0 = Y (t 0 ), M 0 U 0,0 = Y 0,0 , applying (8) with i = 0 gives the approximate functions Y 0 (t),U 0 (t) in the first subinterval (t 0 ,t 1 ). The recursive formulas are initiated for later subintervals at n = 0 by After the final subinterval, we get Y m−1 (t m ), our approximation to Y (1). We shall refer to the new time-stepping scheme generated by (8) as the q-stage SAT (structureaware Taylor) time-stepping. Note that Y m−1 (t m ) is our approximation to Y = MU at the top of the tent. This value is then passed to the next tent in time. The time dependence of M arises from the time dependence of ∇ϕ. This gradient is continuous along spacetime lines of constant spatial coordinates. Therefore, when passing from one element of a tent to the same element within the next tent in time, Y is continuous (since the solution U is continuous). Of course, on flat fronts ∇ϕ = ∇τ = 0, so there M is just a diagonal matrix containing the material parameters.
To briefly remark on the expected convergence rate of a q-stage SAT timestepping, recall that due to the mapping of the MTP method we solve forû = u • Φ, which satisfies ∂ n tû = δ n (∂ n t u) • Φ. The causality condition implies that δ → 0 if the mesh size h → 0. Thus we may expect the n th temporal derivative ofû, and correspondingly U (n) , to go to zero at the rate O(h n ). By using a q-stage SAT timestepping, we approximate the first q − 1 terms of the exact Taylor expansion of U. Thus we expect the convergence rate to be O(h q ), the size of the remainder term involving U (q) . The next section provides numerical evidence for this. Before concluding this section, we should note that in (8) and (9), we tacitly assumed that M i is invertible. Let us show that this is indeed the case whenever the causality condition (see §2) |∇ϕ| < √ ε µ is fulfilled. At any quasi-timet, given â w = (ŵ E ,ŵ H ) ∈ V h whose coefficient vector in the basis expansion is W ∈ R N , consider the equation M(t)U = W for the coefficient vector U ofû ∈ V h . This equation, in variational form, is Let a(û,v) denote the left hand side of (10). To prove solvability of (10), it suffices to prove that a(·, ·) is a coercive bilinear form on [L 2 ] 6 for anyt. By inserting g(û) = [εÊ, µĤ] T and f (û) = [− skewĤ, skewÊ] T into a(û,û), where we used the Cauchy-Schwarz inequality and inserted √ ε and √ µ to achieve the desired scaling. By applying Young's inequality and |∇ϕ| < √ ε µ, form some constant C > 0. Thus M i is invertible and the SAT time-stepping is well defined on all tents respecting the causality condition. One may exploit the specific details of the Maxwell problem to avoid the assembly and the inversion of matrices M i (as we have done in our implementation). In fact, instead of (10), we can explicitly solve the corresponding exact undiscretized equation obtained by replacing V h by [L 2 ] 6 in (10). The solutionû = (Ê,Ĥ) in closed form readsÊ We then perform a projection of these into V h to obtain the coefficients U(t i ). For uncurved elements, this just involves the inversion of a diagonal mass matrix. For the small number of curved elements, we use a highly optimized algorithm which uses an approximation instead of the exact inverse mass matrix.

Numerical Results
The MTP discretization in combination with the SAT time-stepping on tents is implemented within the Netgen/NGSolve finite element library. In this section numerical results concerning accuracy as well as performance are reported.

Convergence studies in two space dimensions
We consider the model problem in two space dimensions Parameters are set ε = µ = 1 such that speed of light is c = 1. Initial and boundary values are set such that the exact solution is given by , Based on a spatial mesh with mesh size h, we generate a tent pitched mesh such that the maximal slope |∇ϕ| is bounded by (2c) −1 and apply a discontinuous Galerkin method in space using polynomials of order p, with 1 ≤ p ≤ 4. On each cylinder we perform a (p + 1)-stage SAT time-stepping with m = 2p intervals. The spatial L 2 error of all field components at the final time is reported in the left plot of Fig. 3. We observe that the error goes to zero at the optimal rate of O(h p+1 ) until we are close to machine precision.
In contrast, the right plot in Fig. 3 illustrates the previously mentioned loss of convergence rates when the classical Runge-Kutta method is used. The convergence rates stagnate at first order no matter what p is used. A similar behavior was also observed for other explicit Runge-Kutta methods.

Large scale problem in three space dimensions
As a second example we present a simulation on a domain similar to the resonator shown in [4]. The geometry is given as body of revolution of smooth B-spline curves. The mesh consisting of 489593 curved tetrahedral elements is shown in Fig. 4. Due to higher curvature the mesh is refined along the inner roundings, where the ratio of the largest to the smallest element is approximately 5:1. We used a Gaussian peak (located at the axis of revolution and the position of the fifth inner rounding) for the electric field as initial data. The explicit MTP scheme with SAT time-stepping then computed the solution at t = 260 using time slabs of height 1, with each slab composed of N tents = 149072 tents. On each tent we used a (p + 1)stage SAT time-stepping with m = 2p intervals, where p denotes the spatial polynomial order. With the spatial degrees of freedom N dof,i of the i th tent and the number of stages q = p + 1, we obtain the total spacetime degrees of freedom per time slab The corresponding numbers of degrees of freedom and the simulation times are shown in Table 1. In [4] a similar problem is solved using a discontinuous Galerkin method with quadratic elements, combined with a polynomial Krylov subspace method in time. Using 96 cores it took them 7:10 hours to reach the final time.
Our simulation with polynomial order p = 3, which has a comparable number of unknowns, took 3:33 hours on 64 cores. This significant speed up is an illustration of the capability of the new method. The H y component of the obtained solution at t = 260, using third order polynomials in space, is shown in Fig. 4.
number of spatial dof 2.938 × 10 7 5.875 × 10 7 number of spacetime dof per slab 1.908 × 10 9 7.632 × 10 9 simulation time per slab 4.6 s 49.2 s total simulation time 20 min 3 h 33 min Table 1 Number of degrees of freedom and simulation times for spatial polynomial orders p = 2, 3. This data was generated using a shared memory server with 4 E7-8867 CPUs with 16 cores each.