An Anisotropic hp-mesh Adaptation Method for Time-Dependent Problems Based on Interpolation Error Control

We propose an efficient mesh adaptive method for the numerical solution of time-dependent partial differential equations considered in the fixed space-time cylinder Ω×(0,T)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\varOmega \times (0,T)$$\end{document}. We employ the space-time discontinuous Galerkin method which enables us to use different meshes at different time levels in a natural way. The mesh adaptive algorithm is based on control of the interpolation error in the L∞(0,T;Lq(Ω))\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$L^\infty (0,T; L^q(\varOmega ))$$\end{document}-norm. The goal is to construct a sequence of conforming triangular meshes in such a way that the interpolation error bound is under a given tolerance and the number of degrees of freedom is minimal. The resulting grids consist of anisotropic mesh elements with varying polynomial approximation degrees with respect to space. We present a theoretical framework of this approach as well as several numerical examples demonstrating the accuracy, efficiency, and applicability of the method.

necessary to achieve a given error tolerance [3,5,11,32,45]. The mesh adaptive methodology is well established for time-independent problems, where the current mesh is adapted based on suitable error estimates of the available approximate solution. This process is repeated several times until the required accuracy is achieved. If any inaccuracy appears during the computational process, it can be compensated by the computations on next adaptive levels, i.e., the error does not propagate. On the other hand, the numerical solution of time-dependent problems is more complicated since the computation is performed step-by-step and any inaccuracy propagates in the physical time.
Among the most efficient adaptive techniques belong the hp-methods which admit refinement (and coarsening) of the mesh elements and variation of the local polynomial approximation degrees. Under some assumptions, an exponential rate of convergence of the computational error with respect to the number of degrees of freedom can be achieved [4,12,32,38,40].
Moreover, the so-called anisotropic mesh adaptation techniques exhibit an efficient tool for the solution of problems containing interior or boundary layer and/or line discontinuities, see, e.g., [1, 8, 29-31, 44, 48] and the references mentioned therein. For a review, we refer to [34,46]. In contrast to the standard mesh refinement method, where the mesh elements are merely split (isotropically and/or anisotropically), the current grid is completely re-meshed.
In recent years, we have developed the anisotropic hp-mesh adaptation method for the numerical solution of time-independent boundary value problems, which combines both aforementioned approaches [14,19,20]. This technique offers sufficient flexibility in the minimization of the number of degrees of freedom (and reduction of the computational time) necessary to achieve a given error tolerance. In particular, we derived interpolation error estimates employing the geometry of mesh elements which are used for the local optimization of the element shape and the polynomial approximation degrees. Further, using the so-called continuous mesh and error models (cf. [19,29,30]) the size of mesh elements is optimized, and the corresponding metric field, used for the mesh construction, is defined. For further details, see the references given above.
The extension of the anisotropic hp-mesh adaptation technique to the solution of timedependent problems is not straightforward. The use of non-matching and non-nested, possibly anisotropic meshes at different time levels can produce inaccuracies, which propagate in time and can degrade the accuracy of the approximate solution. An adaptive finite element method employing isotropic but non-nested grids was proposed in [35] for a linear parabolic problem. In [27], a piecewise linear finite element approximation with anisotropic mesh adaptation controlling the error in L ∞ (0, T ; L q (Ω))-norm was proposed for the simulation of an unsteady bi-fluid model. Further, these techniques were developed and applied mostly to unsteady flow simulation, see, e.g., [2,6,26]. Finally, additional aspects of the adaptive methods for time-dependent problems on unstructured grids were developed, e.g., [9,10,37]. For theoretical aspects, we refer [7] and the references cited therein.
In this paper, we develop an anisotropic hp-mesh adaptive method for the numerical solution of the time-dependent partial differential equation written the form where Ω ⊂ R 2 is the computational domain, T > 0 is the time to be reached and w : Ω × (0, T ) → R n , n ≥ 1 is the sought unknown function. Moreover, ∂ t denotes the partial derivative with respect to t ∈ (0, T ), L is a (possibly nonlinear) differential operator acting on w and f : Ω ×(0, T ) → R n is a source term. Finally, ϑ : R n → R n is a function given by the physical model. In many cases it is an identity (i.e., ϑ(w) = w), but one of the numerical examples presented below addresses the simulation of variably saturated flow through a porous medium, where ϑ is a nonlinear function of w. We assume that Ω is polygonal, for simplicity. Equation (1) is accompanied by the initial condition w(x, 0) = w 0 (x), x ∈ Ω, where w 0 : Ω → R n is a prescribed function. Moreover, (1) has to be equipped with suitable boundary conditions depending on the properties of L . We discretize (1) by the space-time discontinuous Galerkin method, which treats different meshes at different time levels in a natural way. We adopt the aforementioned anisotropic hpmesh adaptation technique to the solution of (1). The resulting space-time adaptive scheme admits hp-adaptation in space, as well as the adaptive choice of the size of the time step, but the time polynomial approximation degree is kept fixed. However, an adaptive choice of the time polynomial degree is possible in principle. We demonstrate the potential of the proposed scheme by a set of numerical experiments. In particular, we show that the use of nonmatching, non-nested anisotropic hp-meshes at different time instances does not negatively affect the accuracy of numerical solution.
The content of the paper is the following. In Sect. 2 we describe the mesh optimization process for a given sufficiently regular function defined on a space domain Ω. We present a theoretical framework leading to the hp-variant of the error equi-distribution, which is directly used in the adaptive algorithm. This is the first novelty of the paper. In Sect. 3 we briefly recall the discontinuous Galerkin discretization of problem (1). The main novelty is given in Sect. 4, where we extend our mesh adaptation technique to the numerical solution of (1). Several implementation aspects are presented and discussed. The performance of the proposed algorithm is demonstrated by several experiments in Sect. 5. A summary of the results is given in Sect 6.

Anisotropic hp-mesh Optimization Process for a Given Function
Let Ω ⊂ R 2 be the computational domain. In this section, we briefly describe the mesh optimization process for a given, sufficiently regular function w : Ω → R. In this discussion, time dependence plays no role. The proposed method draws from our previous work on mesh optimization methods, based on interpolation error control using continuous mesh models. More details on this theoretical framework, including the possible extension to the 3D case, can be found in our recent monograph [18].
By T h p p p = {T h , p p p}, we denote an hp-mesh of Ω, where T h = {K } is a conforming grid of Ω consisting of triangles K with mutually disjoint interiors, and the set of integers p p p = {p K ∈ I, K ∈ T h } represents the polynomial approximation degrees for each K ∈ T h . In general, elements K ∈ T h are anisotropic but hanging nodes are not admitted.
To each hp-mesh T h p p p there exists the unique space of discontinuous piecewise polynomial functions where P p K (K ) is the space of polynomial functions over K ∈ T h p p p with total degree at most p K . The dimension of S h p p p is given by dim S h p p p = K ∈T h ( p K + 1)( p K + 2)/2. This value is called the number of degrees of freedom (DoF). Let C ∞ (Ω) denote the space of infinitely differentiable functions. We introduce a projection Π h p p p : C ∞ (Ω) → S h p p p such that where α = (α 1 , α 2 ) is a multi-index, D α denotes the partial derivative of degree |α| = α 1 +α 2 and x K is the barycenter of x K , K ∈ T h . The projection Π h p p p w is easy to construct element-wise and the difference w − Π h p p p w is called the interpolation error. We are ready to formulate the main problem of this section.

Problem 1
Let w ∈ C ∞ (Ω) be a given function and let ω > 0 be a given tolerance. We seek an hp-mesh such that where · denotes the Lebesgue norm for some q ∈ [1, ∞].
The existence (but not the uniqueness) of the solution of Problem 1 is guaranteed by Zorn's lemma [28]: Since the set of hp-meshes satisfying (i) is nonempty and dim S h p p p > 0, this set has a minimal element. However, the practical search of an (approximate) solution of Problem 1 is not an easy task. One possible solution is the local setting of an optimal shape of mesh elements and the determination of the element size distribution using the continuous mesh and error models. These steps are briefly described in the rest of this section.

Interpolation Error Function and its Anisotropic Bound
We approximate the interpolation error w − Π h p p p w on K ∈ T h by the Taylor polynomial of degree p K + 1 evaluated at the barycenter x K = (x K ,1 , x K ,2 ). Hence, we define the interpolation error function at x K by Obviously, e w x K , p K approximates w − Π h p p p w on K ∈ T h and also on its neighbourhood (including the whole domain Ω). According to [14,Lemma 3.12], we have the following result: There exist A w > 0, ρ w ≥ 1 and ϕ w ∈ [0, 2π] such that where ) is a 2 × 2 diagonal matrix and Q ϕ w is a rotation matrix through the angle ϕ w . The triple {A w , ρ w , ϕ w } is called the anisotropic bound of the interpolation error e w x K , p K at x K , and it depends on the partial derivatives of w at x K appearing in (4). This triple can be determined by the procedure described in [14,Section 3.2].

Remark 1
The values A w , ρ w and ϕ w depend only on the derivatives of degree p K + 1 of w at x K . Therefore, it is possible to evaluate them at any x ∈ Ω and for any integer p K , hence we write and call them the anisotropic bound functions. They are employed in Sect. 2.5 in the framework of the continuous error model.

Geometry of Mesh Element
LetK denote an equilateral reference triangle having the barycenter at the origin of the coordinate system and letK be circumscribed by a unit circle. For each K ∈ T h there exists an affine mapping F K which mapsK on K and can be written as where Q φ K and Q ψ K are rotation matrices through the angles φ K and ψ K , respectively, and is a diagonal matrix with the singular values K ,1 ≥ K ,2 > 0. We define the geometry of triangle K by the triplet and φ K is the angle of rotation from (7). We note that μ K ∼ |K | 1/2 where |K | is the area of K ∈ T h .

Interpolation Error Estimates Employing the Geometry of Mesh Elements
We formulate the estimate of the interpolation error function which takes into account the geometry of mesh elements. For brevity, we consider the case 1 ≤ q < ∞, the case q = ∞ is easier to treat and we refer to [18,Chapter 3].
and e w x K , p K be the corresponding interpolation error function (4) with the anisotropic bound {A w , ρ w , ϕ w } at the barycenter x K satisfying (5). Then Proof For the proof we refer to [18,Lemma 3.21] or [20,Lemma 5.6], where the case q = 2 is treated. The idea is to integrate the q-power of (5) over K , transform the integral to the referenceK and bound the integral overK by the integral over the circumscribed unit ball. By a direct computation, we obtain (9).
The minimization of function G from (10) with respect to σ K and φ K allows us to find the optimal shape of triangle K with fixed area which minimizes the interpolation error bound in the L q -norm. In particular, we have the following result: [18,Theorem 5.4] Let 1 ≤ q < ∞, K be a triangle with size μ K and barycenter x K . Let w ∈ P p K and e w x K , p K be the corresponding interpolation error function (4) with the anisotropic bound {A w , ρ w , ϕ w } at x K satisfying (5). Then triangle K , having the size μ K and minimizing the upper bound of the interpolation error from Lemma 1, has the geometry Moreover, the corresponding interpolation error bound is

Continuous Mesh Model
The idea of the continuous mesh model was first introduced in [29,30] for the h-variant of anisotropic mesh adaptation for piecewise linear approximation. It was extended to higher order piecewise polynomial approximation in [36] and to the hp-version in [19] . Let T h p p p be an hp-mesh characterized by the values {μ K , σ K , φ K , p K }, K ∈ T h , cf. (8). The idea of the continuous mesh model is to define a continuous analogue of T h p p p by the functions such that We call the function μ the element size distribution and p the polynomial degree distribution. Further, we define the complexity of the continuous mesh by the integrand of the previous integral exhibits the density of DoF.

Remark 2
Using (14), we have from (15) where we have used the fact that that |K | ∼ μ 2 K . The value on the right-hand side of (15) is equal to the dimension of the space S h p p p corresponding to T h p p p .

Continuous Error Model
In the same spirit we introduce the continuous error model related to the interpolation error estimate (12). Let p : Ω → [1, ∞) be the polynomial degree distribution function from (13). In virtue of Remark 1, for w ∈ C ∞ (Ω), we set functions (6), where the symbol [·] denotes the nearest integer of its argument. In particular, for x ∈ Ω, the values A w (x) and ρ w (x) are computed by the same procedure as the values in estimate (5) for x K := x and p K : Then, we define the total continuous interpolation error estimate by where and μ : Ω → (0, ∞) is the element size distribution function, cf. (14). (14), we have from (17), (18) and (12) the relations

hp-mesh Optimization Problem and its Solution
The continuous interpolation error estimate (17) depends on the element size and polynomial degree distributions μ and p, respectively. We recall that the estimate (12) (and therefore (17)) makes sense if the mesh is locally optimized, i.e., (11) is valid. Now, we formulate the continuous analogue of the mesh-optimization Problem 1.
In principle, Problem 2 can be solved by the tools of constrained optimization. However, the presence of the unknown function p in the exponent of E (cf. (18)) prevents us from finding an analytical solution, see Appendix. Therefore, in [19], we proposed a semi-analytical solution of Problem 2, which consists of two steps that can be repeated until a (pseudo-)convergence is achieved: (S1) fix the polynomial distribution function p and, by the tools of variational calculus, find an optimal distribution of the element size distribution μ; (S2) having function μ, modify locally the polynomial approximation degree p K for each K ∈ T h by selecting the degree giving the minimal error estimate for fixed density of DoF, cf. Sect. 2.6.2.
The step (S1) is resolved by the following Lemmas.
Proof We set the Lagrangian corresponding to (P2) with constraint (P1) as where 0 = λ ∈ R is the Lagrange multiplier. Function μ is the solution of Problem 2 (with fixed p) if Then, from (21) - (22), we obtain by differentiation The right-hand side of (23) is equal to 0 for all perturbationsμ if which together with (15) implies (20) since λ is a constant.
A consequence of Lemma 3 is the following result, which can be employed in practical realization:

Lemma 4 Let p be the polynomial distribution function and μ be the element size distribution function satisfying (20). Then the interpolation error is equi-distributed over Ω in the sense
Proof In virtue of (20), we have the identity related to the continuous mesh model =Z due to (20) p The "transformation" of identity (26) to its discrete form by K (e(x)) q dx (due to (19)) gives (25).

Setting the Size of the Mesh Elements (Step (S1))
We employ the error equi-distribution (25) for the optimization of the size of mesh elements. Obviously, we have to set μ K , K ∈ T h such that the corresponding interpolation error estimates satisfy From (25) we deduce that e q K ( p K + 2) −1 ≈ C for some C > 0 and consequently e q K ≈ C( p K + 2), K ∈ T h . This relation together with (27) implies .
Hence, we define the local tolerances and the aim is to modify the element size μ K such that e K ≈ ω K ∀K ∈ T h . Therefore, we set the new size of mesh elements μ K by the relation where κ > 0 is a scaling function continuously depending on e K /ω K such that κ(e K /ω K ) > 1 for e K < ω K and κ(e K /ω K ) < 1 for e K > ω K . A suitable choice is shown in Fig. 1, where κ attains the minimum 0.25 for large values e K /ω K (= the maximal reduction of an element size in one level of mesh adaptation) and κ attains its maximum 2.5 for small values e K /ω K (= the maximal increase of an element size in one level of mesh adaptation). The function κ is almost constant for e K /ω K ≈ 1, which makes the algorithm more stable.

Setting of the Polynomial Approximation Degrees (Step (S2))
The polynomial approximation degrees are adapted locally for each K ∈ T h . From the set Q K := { p K − 1, p K , p K + 1}, we select the new degree p K which gives the smallest interpolation error bound for the same density of DoF (cf. (15)). In particular, for each ∈ Q K , we evaluate the anisotropic bound functions A w, , ρ w, and ϕ w, , cf. (5) (depending on the + 1-th derivatives of w). In order to fix the density of DoF, we set the element sizes Then we evaluate the corresponding right-hand side of (12) as and chose ∈ Q K minimizing this interpolation error bound, i.e.,

Mesh Adaptive Algorithm
Employing the setting of optimal shape (Lemma 2), optimal size (Sect. 2.6.1), and polynomial degree (Sect. 2.6.2) of mesh elements, we define Algorithm 1, which exhibits one iteration of a modification of the given hp-mesh T h p p p to produce a new (better) hp-mesh T h p p p . (30) (the error equi-distribution) 6: set new shape σ K and orientation φ K by (11) 7: set new polynomial degree p K by (32) -(33) 8: end for 9: Step 9 of Algorithm 1) is carried out by a definition of a metric field over Ω and performing a sequence of local operations in order to construct an uniform triangulation under this metric.

Remark 4
The aforementioned metric is usually represented by a set of ellipses centered at barycentres of triangles of the mesh to be optimized. Each ellipse is defined as the smallest circumscribed ellipse of the triangle with geometry {μ K , σ K , φ K }. For more detail, see, e.g., [18]. We use the in-house code Angener [13] to generate meshes from a prescribed metric. Problem 1 can be solved approximately by performing several loops of Algorithm 1, see Algorithm 2. We note that sometimes it is necessary to carry out a high number of loops in Algorithm 2 in order to fulfil condition (i) of Problem 1. The number of loops can be reduced by calling subroutine hpAMA with a smaller tolerance ω. Typically we choose one half or one quarter of the prescribed tolerance. Then condition (i) of Problem 1 is achieved earlier but the corresponding number of DoF is naturally larger.

Space-Time Discontinuous Galerkin Method
In this section, we formulate the approximate solution of problem (1) by the timediscontinuous Galerkin method. By (·, ·) 0 , we denote the L 2 (Ω)-scalar product.

Weak Solution
First, we formally introduce a weak formulation of (1). For simplicity, we assume that ϑ is continuously differentiable. Let W and V be suitable spaces of the trial and test functions, respectively, related to the operator L . We denote by w(t)(x) := w(x, t) a function on Ω for any t ∈ (0, T ). We say that function w is the weak solution of (1) if w(t) ∈ W for almost all t ∈ W and where a(·, ·) represents the weak form of the operator L . The right-hand side of (1) is included in a, which is nonlinear with respect its first argument, in general. We assume that there exists a unique weak solution of (34).

Space-Time Discretization
We approximate the solution of (34) by a space-time piecewise polynomial function defined on varying meshes for different time levels For each m = 0, . . . , r , we consider a conforming triangular mesh T h,m of Ω and hpmesh T h p p p,m = {T h,m , p p p}, where p p p = {p K ∈ I, K ∈ T h,m }, and p K denotes again the polynomial approximation degree assigned to K ∈ T h,m . Then, for each m = 0, . . . , r , we define the space where [P p K (K )] n denotes the space of vector-valued polynomial functions over K ∈ T h,m whose total degree is at most p K in each component of ϕ ϕ ϕ h = (ϕ 1 , . . . , ϕ n ) T . We recall that n ≥ 1 denotes the number of equations in (1). Whereas we consider a varying polynomial approximation degree p K with respect to space, the polynomial approximation degree q ≥ 0 is kept fixed. Hence, for any space-time element K × I m , K ∈ T h,m , m = 1, . . . , r , we introduce the spaces of space-time polynomial functions Further, we define the space of piecewise polynomial functions over the space-time layer Ω × I m by which consists of polynomials of degree q ≥ 0 with respect to time. Later, we will employ the functional spaces S S S Obviously, functions from S S S τ,q h,p p p are discontinuous with respect to space as well as time coordinates.
Let w hτ ∈ S S S τ,q h,p p p . For each m = 0, . . . , r − 1, we define the traces of w hτ at t m and the jumps of w hτ on t m by where the form a h represents the discretization of the form a from (34). For more detail see, e.g., [16,. Then the space-time discontinuous approximate solution reads: Relations (41) provided that the exact weak solution w is sufficiently regular.

Recomputation Between Meshes on Two Consecutive Time Levels
As mentioned above, the approximate solution is discontinuous with respect to time. The solution on two successive time intervals I m−1 and I m is joined together in the weak sense by the last term on the left-hand side of (40), i.e., Obviously, w hτ | + m−1 , v| + m−1 ∈ S S S h,p p p,m but w hτ | − m−1 ∈ S S S h,p p p,m−1 . Therefore, the first term on the right-hand side of (43) is easy to evaluate by a numerical quadrature, but the evaluation of the second term is rather delicate since its arguments w hτ | − m−1 and v| + m−1 are given elementwise on different meshes T h,m−1 and T h,m , respectively. See Fig. 2, left, for an illustration. If the test function v| + m−1 ∈ S S S h,p p p,m has support on one element K ∈ T h,m (the blue triangle outlined in bold in Fig. 2, left), then it intersects several triangles K ∈ T h,m−1 .
One possibility how to evaluate the last integral in (43) is to determine all intersections of any K ∈ T h,m with K ∈ T h,m−1 , split the arising polygons onto triangles, and perform the integration over them. Although this approach is mathematically rigorous, its numerical implementation is rather complicated. Moreover, if the intersection of K ∩ K for some K ∈ T h,m−1 is very small and it has an obtuse shape then the numerical evaluation is ill-conditioned, see Fig. 2, center.
A more robust (and less accurate) approach is the evaluation of the last integral in (43) by a composite quadrature rule. In particular, let v be a basis function of S S S h,p p p,m having a support on K ∈ T h,m . Then where D(K ) denotes a non-overlapping partition of K onto simplexes k (cf. Figure 2, right), and the integrals over k ∈ D(K ) are approximated by a quadrature with weights γ i and nodes x k,i , i = 1, . . . , N . In practice, K is split on 4, 9, or 16 self-similar sub-elements k and the Dunavant quadratures [22] are employed. In the experiments presented in this paper, the splitting on 9 sub-elements is employed, the use of more sub-elements has only a slight influence on the solution. Obviously, the quadrature nodes can belong to different K ∈ T h,m−1 , and the order of any quadrature is low since the restriction of ϑ(w hτ )| − m−1 is discontinuous on K ∈ T h,m . However, we demonstrate in Sect. 5 that this approach is sufficiently accurate, namely that there is not an essential difference between the matching and non-matching grids.

Anisotropic hp-mesh Adaptation for Time-Dependent PDEs
In this section, we extend the mesh adaptive algorithm from Sect. 2 to the numerical solution of time-dependent problems. In Sect. 4.1, we introduce an analogue of Problem 1 and in Sect. 4.2, we describe the whole space-time adaptive procedure including several implementation issues.

Problem Formulation
Let w(t) ∈ W , t ∈ (0, T ) be the weak solution of (34) and {I m } r m=1 be the partition of (0, T ) introduced in Section 3.2. Let Π m w(t) ∈ S S S h,p p p,m be the projections of w(t), t ∈ I m , m = 1, . . . , r , cf. (35). In virtue of (42), we introduce the following problem:

Problem 3
Let ω > 0 be a given tolerance and {I m } r m=1 be the given partition of (0, T ). We seek the finite sequence of spaces S S S h,p p p,m , m = 1, . . . , r (i.e., meshes T h,m , m = 1, . . . , r with the polynomial approximation degrees p K , K ∈ T h,m ) such that Problem 3 means that we control the interpolation error w − Π m w for any time t ∈ (0, T ). However, for practical reasons, it makes sense to consider the interpolation error only for a finite set of time levels, typically at the integration nodes used for the evaluation of the time integral over I m in (41a) and at the endpoints of each I m . We denote such sets by J m , m = 1, . . . , r . Therefore, we replace Problem 3 by

Problem 4
Let ω > 0 be a given tolerance and {I m } r m=1 be the given partition of (0, T ). We seek the spaces S S S h,p p p,m , m = 1, . . . , r such that Similar to Problem 1, the space-time problem Problem 4 has a (possibly non-unique) solution. However, the practical solution of Problem 4 is rather difficult and we solve it approximately by the adopting the technique introduced in Sect. 2, which is described in the following section.

Higher-Order Reconstruction
Condition 4 in Problem 4 contains the exact solution w(t), t ∈ J m , which is not available in practice. Therefore, we replace w in this condition by a higher-order reconstruction w hτ , computable from the approximate solution w hτ .
In [14,19,20], we developed a higher-order reconstruction technique based on the least squares approximation. Here, we present its space-time variant. Let I m , m = 1, . . . , r be an interval and w hτ | Ω×I m ∈ S S S τ,q h,p p p,m be the space-time piecewise polynomial function satisfying (41a) for all v ∈ S S S τ,q h,p p p,m . Let N (K ), K ∈ T h,m be a set of K ∈ T h,m sharing at least a vertex with K . We define the function w K ,m ∈ P q+1 p K +1 (N (K )× I m ) (cf. (36)) such that where · L 2 (I m ,H 1 (K )) is the Bochner norm over the space-time element K ×I m , K ∈ N (K ) and δ K ≥ 0 are the weights. We set δ K = 1 for K sharing a face with K , and δ K = for K ∈ N (K ) sharing only a vertex with K . A typical value is = 0.25. The reconstruction (45) is local, and w K ,m approximates w hτ in the weighted least-square sense. Having the local reconstructions w K ,m , K ∈ T h,m , we set the global one w hτ ∈ S S S τ,q+1 h,p p p+1 1 1,m by gluing them together, i.e.,

Adaptive Choice of the Time Step
In Problem 4 we considered an a priori given partition of the time interval (0, T ). In practical computations, the choice of the size of time steps τ m := t m −t m−1 , m = 1, . . . , has to be done adaptively based on the available approximate solution. It is necessary to balance the accuracy (too large time steps cause increase of the computational error) and the efficiency (too small time steps increase the number of time steps and can prolongate the computational time). A reasonable strategy is to choose the time step in such a way that the errors arising from the spatial and temporal discretizations are comparable. We employ the approach proposed in [21], where the space and time errors are estimated by residual-based estimators η m S and η m T , respectively. The numerical examples in [21] show that, whereas η m S is (almost) independent on τ m , m = 1, . . . , r , the time estimator fulfills η m where 0 < c T ≤ 1 is the chosen safety factor. Typically we put c T = 0.1. The aforementioned estimators are defined for each m = 1, . . . , r , using (37), by where the norm is chosen element-wise as Due to the choice (49), both estimators in (48) are computable element-wise and therefore their setting is cheap. In order to fulfill (47), we have developed in [21] the adaptive choice of the time step by the relation η m T = O(τ q+1 m ), cf. (42). However, this technique is not sufficiently efficient, e. g., for porous media flow problems, probably by the lack of regularity of the weak solution.
Therefore, we present here a more robust and efficient technique. We set the parameter ξ m := log(η m T /(c T η m S )) for m = 1, . . . , r . Obviously, the optimal size of the time step gives ξ m ≈ 0. Moreover, if ξ m > 0 the time step should be decreased, and if ξ m < 0 the time step should be increased. The new size of the time step is set according to the following empirical formula where and S, s max and s min are user-defined parameters. Obviously, β(ξ m ) is continuously differentiable with respect to ξ m . The values s max > 1 and s min > 0 describe the maximal relative increase and the maximal relative decrease of the time step, respectively, and S > 0 is the threshold. We use the values S = 1, s max = 1.25, and s min = 0.5. The idea of the adaptive choice of the time step during the computational procedure is the following: We perform the m-th step by the solution of (41a) and evaluate η m S and η m T . If condition η m T > c T η m S (cf. (47)), the current time step is refused and we repeat it with a new time step size τ new given by (50). Otherwise, we proceed to the next time step, again using τ new from (50). Alternatively, although not considered here, it would be possible to increase the time polynomial degree or repeat several last time steps.
Finally, let us note that based on numerical experience, we automatically reduce the size of the time step by a factor β = s min after each re-meshing. This typically helps us to avoid several time steps which are rejected. The whole adaptive algorithm is explained in the next section.

Adaptive Algorithm
The whole adaptive procedure for the solution of time-dependent PDEs is summarized in Algorithm 3. At each time step, we check if the error estimator η m is under the given tolerance (cf. step 14 of Algorithm 3). If this condition is satisfied, we use the hp-mesh from the previous time step(s). Otherwise, we perform a re-meshing. We employ the anisotropic hp-mesh adaptation technique from Sect. 2, namely we call the subroutine hpAMA (cf. The intersection of two metrics are defined through the geometrical intersection of two corresponding ellipses having the same barycenter as the ellipse which is a subset of both ellipses and having the maximal possible area. Moreover, concerning the choice of the polynomial approximation degree, we adopt technique (31)- (33) in such a way that in step (32), we take the maximal value from E ( ) Finally, we note that it is possible to reduce the number of re-meshing operations by using a smaller input tolerance in the calls of Algorithm 1, cf. steps 17-18 of Algorithm 3. However, the resulting hp-grids have naturally higher number of DoF, and then the reduction of computational cost is questionable.

Numerical Verification
In this section, we present several numerical experiments demonstrating the ability of the proposed anisotropic hp-mesh method to solve various problems of type (1). First, we consider a scalar nonlinear convection-diffusion equation with known analytical solution where the exact error and its estimate can be compared. Moreover, we consider several more complicated test examples, namely the isentropic vortex propagation (inviscid compressible flow), the Kelvin-Helmholtz instability (viscous compressible flow) and the simulation of a flow though a variably saturated porous media described by the Richards equation. In all cases, we employ the space-time discontinuous Galerkin method. For the description of the method together with the implementation detail, we refer to [16] and for the porous media problem to [17]. if η m T > c T η m S then 9: propose new size of the time step τ m using (50) 10: end if 11: until η m T ≤ c T η m if η m ≤ ω then 15: set t := t + τ m , propose τ m+1 using (50)

Moving Interior Layer
We solve the viscous Burgers equation written as where Ω = (−1, 1) × (−1, 1), T = 1 and ε > 0 is the diffusion coefficient. We prescribe the Dirichlet boundary condition on Γ := ∂Ω and the initial condition such that the exact solution reads This problem exhibits a propagation of an interior layer in the diagonal direction (1, 1). The width of the layer is proportional to ε. We consider the values ε = 10 −2 and ε = 10 −3 .
Since the exact solution is available, we are able to evaluate the exact error and compare it to the proposed error estimate. To produce reference data, we carried out the computation on fixed uniform triangular grids consisting of rectangular triangles with element size h = 1/6, h = 1/12, and h = 1/24. We employ the polynomial approximations p = 1, 2, 3 with respect to space and q = 1, 2 with respect to time. The time step was fixed at τ = 10 −2 .
The computations on the fixed grids are compared to the anisotropic hp-mesh adaptation, Algorithm 3, with several tolerances ω. The initial mesh was a coarse uniform (h = 1/6) and the initial polynomial degree was set to p K = 2, K ∈ T h . The corresponding results are shown in Tables 1-4, where -DoF is the average number of degrees of freedom per time step, -#τ m is the number of time steps needed to reach the final time T , e h := w − w hτ L ∞ (0,T ;L 2 (Ω)) is the error of the approximate solution w hτ , -E h := w hτ − w hτ L ∞ (0,T ;L 2 (Ω)) is the corresponding error estimate obtained by the higher-order reconstruction, -the quantities measure the jumps with respect to time (the "time-inconsistency") due to the time discontinuous approximation, i N is the total number of nonlinear iteration for all time levels, i L is the total number of linear (GMRES) iteration for all nonlinear iterations and all time levels, -"time" is the total computational time in seconds. It depends on the implementation so it has only an informative character.
Some outputs from these tables are also shown for q = 2 in Fig. 3, namely the convergence of the error e h and its estimate E h with respect to DoF and computational time.
As one might expect, we observe that higher polynomial approximation degree on fixed meshes leads to faster decay of the errors and higher efficiency in terms of DoF as well as computational time. The use of mesh adaptation significantly reduces the computational cost required to achieve the given error tolerance. Moreover, the error estimate using the higher order reconstruction does not provide an upper bound of the error, but the approximation is reasonable, as the rates of e h and E h are quite similar. Finally, comparing the values J aver and J max , we arrive at the conclusion that the use of non-nested meshes with the recomputation from Sect. 3.3 does not bring any essential increase of the inaccuracies. These values depend on the polynomial approximation degree with respect to time (for q = 2 they are much smaller than for q = 1), but there are minor differences between the computation with and without mesh adaptation.
Moreover, Fig. 4 demonstrates the performance of Algorithm 3. The left-hand-side figure shows the error estimate η m of all computed time steps (blue square boxes). The time steps having η m > ω are refused, while the accepted time steps are marked by black crosses. The right-hand side figure shows the comparison of the error w − w hτ L ∞ (I m ,L 2 (Ω)) and its estimate η m for all (accepted as well as refused) time steps m = 1, . . . , r . We observe a reasonable approximation of the error locally in time.
Finally, Figs. 5 and 6 show the achieved hp-meshes and the corresponding isolines of the solution at time instants t = 0.2, t = 0.6, and t = 1.0. We observe a strong anisotropic refinement along the interior layers, the refinement is stronger for the case with ε = 10 −3 . Outside of the layer, the mesh consists of large elements with the lowest polynomial approximation degree. The isolines show a perfect capturing without any oscillations.

Isentropic Vortex Propagation
The propagation of an isentropic vortex through the periodic domain is a classical benchmark proposed in [39]. This problem is described by the compressible Euler equations where the sought solution vector is w = (ρ, v, e) T ∈ R 4 : ρ is the density, v is the velocity vector, and e is the energy. This system is accompanied by the state equation defining the relation between the energy and pressure p, namely e = p/(γ − 1) + 1 2 ρ|v| 2 , where γ = 1.4 is the adiabatic Poisson constant. For details, we refer to, e.g., [25,47].
We consider the computational domain Ω := [0, 10] × [0, 10] which is extended periodically in both directions. The mean flow is defined byρ = 1,v = (1, 1) T andp = 1. An isentropic vortex is added to the mean flow, i.e., there is a perturbation in v and the temperature θ = p/ρ, but no perturbation in the entropy S = p/ρ γ , where γ is the Poisson constant. For analytical relations we refer to [39] or [16,Chapter 8].  The aforementioned problem setting leads to the passive convection of the vortex with the mean velocity, hence the analytical solution is available and the error can be evaluated. In particular, the flow is time-periodic with the periodt = 10, i.e., we have w(x, t) = w(x, t + 10) for all x ∈ Ω and t > 0. We carried out the computation till T = 30, hence 3 time-periods have been computed.
We applied Algorithm 3 with several tolerances. The corresponding results are shown in Table 5. The symbols are the same as in the previous example. Obviously, for decreasing tolerances, the error as well as its estimates are decreasing. Similar to the scalar example from Sect. 5.1, we observe a reasonable approximation of the error by the interpolation error in the L ∞ (0, T ; L 2 (Ω))-norm. The time inconsistency quantities J aver and J max are decreasing for decreasing tolerances. Figure 7 shows the error and its estimates for all time steps (including the refused time steps). We observe that although the error approximation during the first period (t ∈ (0, 10)) is reasonable, the error starts to increase during the second period, whereas the error estimate is under the tolerance ω. This is caused by the fact that the computational error is accumulated for the long time interval, whereas the higher-order reconstruction is local in time.  30), and we observe that the graphs of isolines are very similar. This effect is also demonstrated by Fig. 9 where the distribution of the density and the Mach number along the diagonal cut through the vortex at t = 10, t = 20, and t = 30 are shown. We observe almost identical results. Hence the use of non-nested  anisotropic hp-meshes in combination with space-time DGM does not cause any essential lost of accuracy.

Kelvin-Helmholtz Instability
The next example exhibits the simulation of the Kelvin-Helmholtz instability, which appears when a velocity difference across the interface between two fluids is presented. Then there appear the typical Kelvin-Helmholtz roll-ups, which are challenging to simulate numerically. The problem is described by the compressible Navier-Stokes equation having the same sought vector w = (ρ, v 1 , v 2 , e) as in Sect. 5.2. For details, we refer again to [25,47]. We consider exactly the same setting as in [42]. The computational domain Ω = (0, 1) 2 is extended periodically in both directions and the final time is T = 2. The initial conditions are given by where we set ε = 0.01 sin(4π x 1 ) to trip the instability. We consider the fluid viscosity μ = 2 · 10 −4 , the heat capacity at constant pressure c p = 1005, the adiabatic Poisson constant γ = 1.4, and the Prandtl number Pr = 0.72. Figure 10 shows the hp-meshes and the density distribution obtained by Algorithm 3 at several time instants. We observe the development of the roll-ups and the expectable hpmesh adaptation. High polynomial degrees are generated within the roll-ups, the shapes of elements follow their directions. The lowest polynomial degrees ( p = 1) are outside the interfaces where the solution is almost constant.

Simulation of the Single Ring Infiltration
The last example exhibits a simulation of the flow through a variably saturated medium described by the Richards equation, written in the form where ψ and Ψ denote the pressure and hydraulic heads, respectively, related by Ψ = ψ +x 2 , and x 2 is the vertical coordinate. Moreover, K(ψ) is the unsaturated hydraulic conductivity given by K(ψ) = K r (ψ)K s , where K r (ψ) is the relative hydraulic conductivity, and K s is the saturated hydraulic conductivity tensor. Furthermore, ϑ(ψ) denotes the active pore volume given by ϑ(ψ) is the water content function, θ S is the limited saturated water content, and S s is the specific storativity. The function θ(ψ) is given by the van Genuchten's law [43], and the relative conductivity K r (ψ) is given by the Mualem function [33]. For a more detailed description we refer to, e.g., [17,41]. Both functions, ϑ and K, depend nonlinearly on their arguments, and they are not continuously-differentiable at ψ = 0, which causes difficulties in the convergence of the solvers.
We consider the simulation of the single-ring infiltration. The computational domain is shown in Fig. 11. At the time t = 0, a dry medium with Ψ = −2 is prescribed. On the boundary part Γ D we set the Dirichlet boundary condition Ψ = 1.05, and on ∂Ω \ Γ D we consider the homogeneous Neumann boundary condition. We note that the smaller "magenta" vertical lines starting at Γ D also belong to the boundary and they are impermeable. The inconsistency of the initial and boundary condition on Γ D makes the computation rather difficult for t ≈ 0.
We carried out the computation until the physical time T = 2 hours, and we are interested in the water flux through the boundary Γ D . Hence, we consider two quantities, the actual flux and the total (accumulated) flux given by and respectively. The results obtained by Algorithm 3 for several tolerances are shown in Table 6. Besides the quantities measuring the jumps with respect to the time, the number of algebraic iterations, and computational time, we present the accumulated flux F(T ) and also the quantity δ(T ) indicating the conservativity of the numerical method. Namely, δ(t) is the relative difference between the increase of the water contents Ω (ϑ(x, t) − ϑ(x, 0)) dx and the total flux − t 0 ∂Ω K(ψ)∇Ψ (x, t ) · n dSdt (≈ F(t) due to the prescribed boundary conditions). Table 6 shows that δ(T ) is about 1% which is an acceptable inaccuracy. Furthermore, Fig. 12 presents the dependence of the actual flux F(t) on t ∈ (0, T ) for the treated tolerance. The left figure shows the global view and the other ones the details close t = 0 and t = T . We observe that the computations with lower tolerances do not capture the behavior well. On the other hand, this inaccuracy affects the total flux only slightly.  Moreover, Fig. 13 shows the hp-meshes and the distribution of the hydraulic head at selected time levels obtained from Algorithm 3 with ω =5E-03. Finally, we note that computational times observed for this example (last column in Table 6) are very large, which is caused by the high number of linear and nonlinear iterations. This is caused in turn by the non-regularity of the constitutive relations for ϑ and K in (55). Hence, the use of an efficient adaptation method, which significantly reduces the number of DoF, is an essential tool which can accelerate the computational process.

Summary
We proposed an adaptive space-time discontinuous Galerkin method for the numerical solution of time-dependent PDEs based on the control of the interpolation error, which is estimated by the space-time variant of a higher-order reconstruction. If the interpolation error estimate is over the prescribed tolerance in the particular time step, the computational grid is completely re-meshed including the shape of elements and the polynomial approximation degrees, and the time step is repeated. Thus the anisotropic methodology reliably aligns the grid with dominant solution features leading to significant error reduction. Although the grids employed at the different time steps are non-nested and non-matching, a simple recomputation technique among them does not lead to any essential decrease of accuracy. This effect was demonstrated by the presented numerical examples.
One downside of the current approach is that the interpolation error estimates employed do not provide an upper bound of the error. One possibility is to use the approach developed, e.g., in [23,24]. However, from a practical point of view, a better option seems to be the development of goal-oriented error estimates. We dealt with this technique in [15,20] for time-independent problems. Nevertheless, the goal-oriented anisotropic hp-mesh adaptation method for time dependent problems is currently completely open.
Finally, the presented approach is based on the minimization of the number of degrees of freedom. However, the reduction of DoF does not imply the reduction of the computational cost at the same rate. The presence of anisotropic elements, as well as the varying polynomial approximation degrees, typically increase the difficulty in the solution of the arising algebraic systems. The effect requires a deeper numerical analysis which will be the subject of further research.
Funding Open access publishing supported by the National Technical Library in Prague. The authors have not disclosed any funding.
Data Availability Enquiries about data availability should be directed to the authors.

Conflict of interest The authors have not disclosed any competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
A direct calculation (cf. proof of Lemma 3) yields a system of nonlinear exponential-algebraic relations However, it is not clear if functions μ(x) and p(x) can be eliminated from (59).