1 Introduction and Historical Background

The purpose of this paper is to show that variational numerical problems have their own conservation laws which are derived from the same principle as that discovered by Noether, giving rise to discrete (numerical) forms of conservation laws which are automatically preserved by the scheme.

Symmetries are an extremely important and recurring feature of differential equations arising in many areas of application, including mathematical physics, meteorology, and differential geometry. Their theory was first developed by Sophus Lie in the late nineteenth century for the purpose of studying solutions of differential equations ([1, 2, 3, 4, 5, c.f.] and references therein).

Noether’s (first) Theorem [1, 5, 6] is a striking result which in the continuous setting connects these symmetries with conservation laws associated with the Euler–Lagrange equations of a variational problem. Roughly, the theorem states that given a variational problem with an underlying symmetry, there exists a natural conservation law associated with it. For example, a symmetry of translation with respect to the spatial coordinates results in a conservation of linear momentum, a symmetry of rotation results in conservation of angular momentum, and a symmetry of translation with respect to the temporal coordinate gives a conservation of energy. A famous example from meteorology is that of potential vorticity. This is a conservation law arising from a particle relabelling pseudo-group symmetry. This quantity is extremely important in studying the evolution of a cyclone [7, c.f.].

The work of Noether has gained public attention recently with the publication of an article in the New York Times [8] where the result is

“consider[ed] ...as important as Einstein’s theory of relativity; it underg[ir]ds much of today’s vanguard research in physics, including the hunt for the almighty Higgs boson.”

In the discrete setting, Noether’s Theorem has been studied in terms of difference equations [9, 10], where it was shown that a discrete equivalent of the conservation law holds when a smooth symmetry was built into the discrete Lagrangian. In this work, we turn our attention to the finite element method (FEM). FEMs form one of the most successful numerical methods for approximating the solution to partial differential equations (PDEs) [11, 12, 13, c.f.]. A topic which has been the subject of much ongoing research is that of constructing FEMs which inherit some property of the continuous problem. The notion of discretisations inheriting some geometric property from the continuous problem can be seen as a generalisation of geometric integration [14, c.f.] to the case of PDEs and is a rapidly developing area of research. Some of the properties studied in the discretisation of PDEs are the same as in the geometric integration of the ODE, for example the Hamiltonian structure of a given problem. Others are based on completely new notions, for example the recent development of the discrete exterior calculus [15, 16], which, as the name suggests, is a discrete equivalent to the Cartan-based exterior calculus. This has allowed for a rigorous description of discrete differential forms and the associated discrete function spaces as a discrete differential complex. This provides a framework which may be used as a first step in the construction of a variational complex in a similar light to that developed in [10] for difference equations. A first step in this direction was taken in [17]. A review of some of the huge quantity of topics arising from this area, including Lie group integrators, discrete gradient methods as well as FEMs for differential forms, is given in [18].

In contrast to geometric integrators, which are numerical methods that preserve some geometric property of an ODE, the corresponding methods for PDEs are generally called mimetic methods. A class of FEMs which fall under the mimetic framework are the mixed methods, for example the Raviart–Thomas scheme [19]. Further, there are finite difference (FD) [20] and finite volume (FV) schemes which are characterised as mimetic. Relationships can exist between these; for example, an appropriate choice of quadrature for the Raviart–Thomas finite element scheme results in the mimetic FD scheme [21].

The Lagrangian piecewise polynomial FEM is not a mimetic method. Indeed, most standard methods cannot inherit geometric properties of the continuous PDE: There is an underlying algebraic condition which must be satisfied for differential geometric properties to be inherited by the approximation scheme [15, 17].

The classical Noether Theorem is only applicable to classical solutions of the variational problem. As such, we derive weaker versions of the theorem applicable to a wider class of solutions to the problem, including the broken extremals. We will discuss how these laws are naturally passed down to the Lagrangian finite element scheme and hence quantify a discrete Noether quantity associated with this FEM. That is, we write the exact Noether quantity for this discretisation, in the same spirit as [10]. We will also study how well the Lagrangian finite element scheme satisfies the strong conserved quantities arising from Noether’s Theorem measured in an appropriate weak norm. That is, we consider how well this finite element scheme approximates the Noether conservation law for the continuous problem (when one exists). We will also present some numerical results, quantifying the deviation of the approximation in terms of a computable estimator which we are able to use to construct an h-adaptive scheme (based on local mesh refinement) aimed at minimising the violation of the smooth conservation law to a user-specified tolerance.

We note related work by Christiansen and Halvorsen [22], which gives a Noether Theorem for a class of problems which includes a simplicial discretisation of the gauge invariant Yang Mills Lagrangian. The group action is assumed to preserve fibres and the conservation law yields the desired associated constraint preservation. Although the authors talk about “discrete gauge invariance”, it is important to realise that it is not the group action which has been discretised, but rather that the action is induced on a discretised base space and function space. The induced action must still be smooth in order to obtain a Noether conservation law.

The paper is set out as follows: In Sect. 2 we introduce some fundamental notation and the model problem we consider. In Sect. 3 we briefly describe Noether’s Theorem and the background material needed. To illustrate its application we apply the theorem to a simple model problem. In Sect. 4 we weaken the invariance criterion on which the classical Noether Theorem is based, ultimately allowing us to prove two versions of the theorem applicable to weaker solutions of the problem. In Sect. 5 we discuss how the results of Sect. 4 can be passed down to give discrete counterparts to our weak Noether’s Theorem. We perform numerical experiments to demonstrate that the quantities derived are indeed conserved at the discrete level. We also discuss trivial Lie group actions (those of translation with respect to the dependent variable) and how the mimetic methods relate to this case. Finally, in Sect. 6 we study the properties of the finite element solution with respect to the original (strong) Noether Theorem. We also detail the construction of a computable estimator, aimed at measuring the violation of the strong Noether Theorem in a weak norm for the Lagrangian finite element scheme. We perform some numerical experiments of the estimator over the finite element approximation of the solution to the Euler–Lagrange equations. We then proceed to test an adaptive scheme based on the estimate allowing us to minimise the violation of the continuous conserved quantity up to user-specified tolerance.

Finally, we thank the anonymous referees for their thorough reading and constructive criticism.

2 Notation

Let \(\Omega \subset \mathbb {R} ^d\) be a bounded domain with boundary \(\partial \Omega \). We begin by introducing the Sobolev spaces [13, 23]

$$\begin{aligned} {\text {L}} _{p}(\Omega )= & {} \left\{ \phi :\;\int _\Omega \left| \phi \right| ^p {\,\mathrm {d}\varvec{x}} < \infty \right\} \text { for } p\in [1,\infty ) \text { and }\nonumber \\ {\text {L}} _{\infty }(\Omega )= & {} \left\{ \phi :\;{\text {ess sup}}_\Omega \left| \phi \right| < \infty \right\} , \end{aligned}$$
(2.1)
$$\begin{aligned} {\text {W}} ^{k}_{p}(\Omega )= & {} \left\{ \phi \in {\text {L}} _{p}(\Omega ):\;\partial ^{\varvec{\alpha }}\phi \in {\text {L}} _{p}(\Omega ), \text { for } \left| \varvec{\alpha }\right| \le k \right\} \text { and } {\text {H}} ^{k}(\Omega ) := {\text {W}} ^{k}_{2}(\Omega ),\nonumber \\ \end{aligned}$$
(2.2)

which are equipped with the following norms and semi-norms:

$$\begin{aligned}&\left\| v\right\| _{{\text {L}} _{p}(\Omega )}^p := {\int _\Omega \left| v\right| ^p {\,\mathrm {d}\varvec{x}}}, \qquad \left\| v\right\| _{{\text {W}} ^{k}_{p}(\Omega )}^p = \sum _{\left| \varvec{\alpha }\right| \le k}\left\| \partial ^{\varvec{\alpha }} v\right\| _{{\text {L}} _{p}(\Omega )}^p, \end{aligned}$$
(2.3)
$$\begin{aligned}&\left| v\right| _{{\text {W}} ^{k}_{p}(\Omega )}^p = \sum _{\left| \varvec{\alpha }\right| = k}\left\| \partial ^{\varvec{\alpha }} v\right\| _{{\text {L}} _{p}(\Omega )}^p, \qquad \left\| v\right\| _{{\text {H}} ^{k}(\Omega )}^2 = \left\| v\right\| _{{\text {W}} ^{k}_{2}(\Omega )}^2, \end{aligned}$$
(2.4)

where \(\varvec{\alpha }= \{ \alpha _1,\ldots ,\alpha _d\}\) is a multi-index, \(\left| \varvec{\alpha }\right| = \sum _{i=1}^d\alpha _i\) and derivatives \(\partial ^{\varvec{\alpha }}\) are understood in a weak sense. We pay particular attention to the cases \(k = 1,2\) and

$$\begin{aligned} {\overset{{\scriptscriptstyle \circ }}{{\text {W}}}}{}^{1}_{p}(\Omega ) := \text {closure of }{\text {C}} ^{\infty }_0(\Omega ) \text { in } {\text {W}} ^{1}_{p}(\Omega ). \end{aligned}$$
(2.5)
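
The norms and semi-norms in (2.3) and (2.4) can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the text: it takes \(d = 1\), \(\Omega = (0,1)\), a piecewise linear hat function (which vanishes on the boundary and so lies in the space (2.5)), and a midpoint quadrature rule; all function names are hypothetical.

```python
def hat(x):
    """Piecewise linear hat function on (0, 1), vanishing at both endpoints."""
    return 1.0 - abs(2.0 * x - 1.0)


def hat_prime(x):
    """Weak derivative of the hat function (defined away from the kink x = 1/2)."""
    return 2.0 if x < 0.5 else -2.0


def lp_norm(g, p, n=2000):
    """Midpoint-rule approximation of the L_p(0, 1) norm of g, 1 <= p < infinity."""
    h = 1.0 / n
    total = sum(abs(g((i + 0.5) * h)) ** p for i in range(n))
    return (h * total) ** (1.0 / p)


def w1p_norm(g, dg, p, n=2000):
    """Approximate W^1_p(0, 1) norm: sum the p-th powers of the L_p norms of g and g'."""
    return (lp_norm(g, p, n) ** p + lp_norm(dg, p, n) ** p) ** (1.0 / p)


# For the hat function the exact values are |v|_{H^1}^2 = 4 and ||v||_{H^1}^2 = 1/3 + 4.
h1_seminorm_sq = lp_norm(hat_prime, 2) ** 2
h1_norm_sq = w1p_norm(hat, hat_prime, 2) ** 2
print(h1_seminorm_sq, h1_norm_sq)
```

The even number of quadrature cells is chosen so that no midpoint coincides with the kink at \(x = 1/2\), where the weak derivative is not defined pointwise.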

Let \(L = L\left( {\varvec{x}, u, \nabla u}\right) \) be the Lagrangian. We will let

$$\begin{aligned} \begin{array}{rccl} {\mathscr {J} [\cdot ]}: &{} {{\overset{{\scriptscriptstyle \circ }}{{\text {W}}}}{}^{1}_{p}(\Omega )} &{} \rightarrow &{} {\mathbb {R}} \\ &{} {\phi } &{}\mapsto &{} {\mathscr {J} [\phi ] := \displaystyle \int _\Omega L(\varvec{x}, \phi , \nabla \phi ) {\,\mathrm {d}\varvec{x}}} \end{array}\quad \end{aligned}$$
(2.6)

be known as the action functional. The problem arising from the calculus of variations is to seek a function extremising the action functional. For simplicity we will consider the minimisation problem, that is, to find \(u\in {\overset{{\scriptscriptstyle \circ }}{{\text {W}}}}{}^{1}_{p}(\Omega )\) such that

$$\begin{aligned} \mathscr {J} [u] = \inf _{v\in {\overset{{\scriptscriptstyle \circ }}{{\text {W}}}}{}^{1}_{p}(\Omega )} \mathscr {J} [v]. \end{aligned}$$
(2.7)

Note that we are implicitly coupling the minimisation problem with homogeneous Dirichlet boundary conditions.
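
To make the minimisation problem (2.7) concrete, the following minimal sketch restricts it to a finite-dimensional space. Everything specific here is an assumption made for illustration and is not taken from the text: \(d = 1\), \(\Omega = (0,1)\), the Lagrangian \(L = \frac{1}{2}(u')^2 - u\), a uniform mesh, and continuous piecewise linear functions vanishing at the endpoints. The discrete action is then a quadratic function of the interior nodal values, and setting its gradient to zero gives a tridiagonal linear system. The function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting; adequate for this SPD matrix)."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def discrete_energy(interior, n_cells):
    """Action of the piecewise linear function with the given interior nodal values."""
    h = 1.0 / n_cells
    nodal = [0.0] + list(interior) + [0.0]  # homogeneous Dirichlet values
    gradient_part = 0.5 * sum((nodal[i + 1] - nodal[i]) ** 2 / h for i in range(n_cells))
    load_part = h * sum(interior)  # integral of 1 * v_h for piecewise linear v_h
    return gradient_part - load_part


def minimise_p1(n_cells):
    """Interior nodal values of the minimiser of discrete_energy."""
    h = 1.0 / n_cells
    m = n_cells - 1
    diag = [2.0 / h] * m
    off = [-1.0 / h] * (m - 1)
    load = [h] * m
    return solve_tridiagonal(off, diag, off, load)


U = minimise_p1(8)
print(U, discrete_energy(U, 8))
```

For this particular one-dimensional problem the smooth minimiser is \(u(x) = x(1-x)/2\), and the computed nodal values agree with it at the mesh nodes up to rounding; this nodal exactness is special to the chosen setting and should not be expected in general.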

We will use the notation that

$$\begin{aligned} \partial _1 q := \nabla q = {\left( {\frac{\partial q\left( {\varvec{x}, u, \nabla u}\right) }{\partial x_1}, \ldots , \frac{\partial q\left( {\varvec{x}, u, \nabla u}\right) }{\partial x_d} }\right) }^{{\varvec{\intercal }}} \end{aligned}$$
(2.8)

denotes a column vector of spatial derivatives of a generic scalar valued function q, i.e., derivatives with respect to the independent variables. The derivative with respect to the dependent variable is denoted

$$\begin{aligned} \partial _2 q := \frac{\partial q\left( {\varvec{x}, u, \nabla u}\right) }{\partial u}. \end{aligned}$$
(2.9)

Let \(\varvec{p} = {\left( {p_1,\ldots ,p_d}\right) }^{{\varvec{\intercal }}} = \nabla u\) then

$$\begin{aligned} \partial _3 q := {\left( {\frac{\partial q\left( {\varvec{x}, u, \nabla u}\right) }{\partial p_1}, \ldots , \frac{\partial q\left( {\varvec{x}, u, \nabla u}\right) }{\partial p_d}}\right) }^{{\varvec{\intercal }}} \end{aligned}$$
(2.10)

denotes the vector of derivatives of q with respect to the gradient of u componentwise. We use \({\text {div}}\) to represent the spatial divergence of a vector-valued function, \(\varvec{q} = \left( {q_1, \ldots , q_d}\right) \), hence

$$\begin{aligned} \partial _1 \varvec{q} := {\text {div}}\left( {\varvec{q}}\right) = \frac{\partial q_1\left( {\varvec{x}, u, \nabla u}\right) }{\partial x_1} + \cdots + \frac{\partial q_d\left( {\varvec{x}, u, \nabla u}\right) }{\partial x_d}. \end{aligned}$$
(2.11)

The derivative with respect to the dependent variable is then a column vector

$$\begin{aligned} \partial _2 \varvec{q} := {\left( { \frac{\partial q_1\left( {\varvec{x}, u, \nabla u}\right) }{\partial u}, \ldots , \frac{\partial q_d\left( {\varvec{x}, u, \nabla u}\right) }{\partial u} }\right) }^{{\varvec{\intercal }}} \end{aligned}$$
(2.12)

and

$$\begin{aligned} \partial _3 \varvec{q} := \begin{bmatrix} \frac{\partial q_1\left( {\varvec{x}, u, \nabla u}\right) }{\partial p_1} ,\ldots , \frac{\partial q_d\left( {\varvec{x}, u, \nabla u}\right) }{\partial p_1}\\ \vdots \qquad \ddots \qquad \vdots \\ \frac{\partial q_1\left( {\varvec{x}, u, \nabla u}\right) }{\partial p_d} ,\ldots , \frac{\partial q_d\left( {\varvec{x}, u, \nabla u}\right) }{\partial p_d} \end{bmatrix}. \end{aligned}$$
(2.13)

With the above notations we may introduce the total derivative operator, defined for scalar valued functions as

$$\begin{aligned} \mathrm {D}q\left( {\varvec{x}, u, \nabla u}\right) := \partial _1 q + \nabla u \partial _2 q + \mathrm {D}^2u \partial _3 q, \end{aligned}$$
(2.14)

and total divergence operator, for vector-valued functions as

$$\begin{aligned} {\text {Div}}\left( {\varvec{q}\left( {\varvec{x}, u,\nabla u}\right) }\right) := \partial _1 \varvec{q} + {\left( {\partial _2 \varvec{q}}\right) }^{{\varvec{\intercal }}} \nabla u + {\partial _3 \varvec{q}}{:}{\mathrm {D}^2u}, \end{aligned}$$
(2.15)

where \({\varvec{X}}{:}{\varvec{Y}} = {\text {trace}}\left( {{\varvec{X}}^{{\varvec{\intercal }}}{\varvec{Y}}}\right) \) denotes the Frobenius inner product between matrices.
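
The total derivative (2.14) is the chain rule applied to \(\varvec{x} \mapsto q(\varvec{x}, u(\varvec{x}), \nabla u(\varvec{x}))\). The sketch below checks this numerically in one space dimension. The function \(q\), the choice \(u = \sin \), the evaluation point and the step size are all hypothetical choices made for illustration.

```python
import math


# A hypothetical scalar function q(x, u, p), with p standing for u'.
def q(x, u, p):
    return x * x * u + u * p * p


def d1_q(x, u, p):  # partial derivative with respect to x
    return 2.0 * x * u


def d2_q(x, u, p):  # partial derivative with respect to u
    return x * x + p * p


def d3_q(x, u, p):  # partial derivative with respect to p
    return 2.0 * u * p


def u(x):
    return math.sin(x)


def du(x):
    return math.cos(x)


def d2u(x):
    return -math.sin(x)


def total_derivative(x):
    """One-dimensional form of (2.14): d1 q + u' d2 q + u'' d3 q."""
    U, P = u(x), du(x)
    return d1_q(x, U, P) + P * d2_q(x, U, P) + d2u(x) * d3_q(x, U, P)


def composite(x):
    return q(x, u(x), du(x))


def central_difference(g, x, h=1e-5):
    return (g(x + h) - g(x - h)) / (2.0 * h)


x0 = 0.7
print(total_derivative(x0), central_difference(composite, x0))
```

The two printed numbers should agree up to the truncation and rounding error of the central difference.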

It is well known [23, 24, c.f.] that if u is a (smooth) minimiser of the variational problem (2.7) then it solves the quasilinear, second order PDE called the Euler–Lagrange equations

$$\begin{aligned} \left( {\mathscr {E} \mathscr {L}}\right) [u] := - {\text {Div}}\left( {\partial _3 L}\right) + \partial _2 L = 0. \end{aligned}$$
(2.16)
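
As a brief worked illustration (a sketch, not part of the original derivation), consider the Lagrangian \(L = \frac{1}{2}\left| \nabla u\right| ^2 - f u\) with \(f = f(\varvec{x})\), which is the one used in Example 3.9. In the notation above, \(\partial _2 L = -f\) and \(\partial _3 L = \nabla u\), so that

$$\begin{aligned} \left( {\mathscr {E} \mathscr {L}}\right) [u] = -{\text {Div}}\left( {\nabla u}\right) - f = -\Delta u - f, \end{aligned}$$

and (2.16) reduces to Poisson’s equation \(-\Delta u = f\).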

3 Noether’s First Theorem

For the reader’s benefit, we briefly describe Noether’s first Theorem in the classical case, together with the necessary background material. We assume, in this section, that L is smooth and the minimisation problem (2.7) has a solution (not necessarily unique) which is at least \({\text {C}} ^{2}(\Omega )\), i.e., smooth enough to satisfy the Euler–Lagrange equations (2.16).

Definition 3.1

(one-parameter group) The transformation

$$\begin{aligned} \left( {\varvec{x}, u}\right) \rightarrow \left( {\Xi (\varvec{x}, u;~\epsilon ), \Phi (\varvec{x}, u;~\epsilon )}\right) =: \left( {\widetilde{\varvec{x}}, \widetilde{u}}\right) \end{aligned}$$
(3.1)

is said to be a one-parameter group if the following conditions hold

  1. The parameter choice of \(\epsilon = 0\) yields the identity, i.e.,

    $$\begin{aligned} \left( {\varvec{x},u}\right) = \left( {\Xi (\varvec{x}, u;~0), \Phi (\varvec{x}, u;~0)}\right) . \end{aligned}$$

  2. The inverse is given by the parameter \(-\epsilon \), i.e.,

    $$\begin{aligned} \left( {\varvec{x},u}\right) = \left( {\Xi (\widetilde{\varvec{x}}, \widetilde{u};~-\epsilon ), \Phi (\widetilde{\varvec{x}}, \widetilde{u};~-\epsilon )}\right) . \end{aligned}$$

  3. The transformation is closed under composition, specifically, if

    $$\begin{aligned} \left( {\widehat{\varvec{x}}, \widehat{u}}\right) = \left( {\Xi (\widetilde{\varvec{x}}, \widetilde{u};~\delta ), \Phi (\widetilde{\varvec{x}}, \widetilde{u};~\delta )}\right) \end{aligned}$$

    then

    $$\begin{aligned} \left( {\widehat{\varvec{x}}, \widehat{u}}\right) = \left( {\Xi ({\varvec{x}}, u;~\epsilon +\delta ), \Phi ({\varvec{x}}, u;~\epsilon +\delta )}\right) . \end{aligned}$$
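
The three conditions of Definition 3.1 can be checked numerically for a concrete transformation. The sketch below does this for planar rotations of the independent variables with \(u\) left unchanged; the sample point, the parameter values and the tolerance are arbitrary choices, and the function names are hypothetical.

```python
import math


def transform(x, y, u, eps):
    """Rotation of the independent variables by the angle eps; u is left unchanged."""
    c, s = math.cos(eps), math.sin(eps)
    return (x * c - y * s, x * s + y * c, u)


def close(a, b, tol=1e-12):
    return all(abs(ai - bi) <= tol for ai, bi in zip(a, b))


point = (0.3, -1.2, 0.5)
eps, delta = 0.4, -1.1

# (1) eps = 0 gives the identity.
identity_ok = close(transform(*point, 0.0), point)

# (2) the parameter -eps undoes the parameter eps.
moved = transform(*point, eps)
inverse_ok = close(transform(*moved, -eps), point)

# (3) applying eps and then delta equals applying eps + delta.
composition_ok = close(transform(*moved, delta), transform(*point, eps + delta))

print(identity_ok, inverse_ok, composition_ok)
```

A check at one point is of course not a proof; for rotations the three properties follow from the angle addition formulae.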

Definition 3.2

(infinitesimal) The infinitesimals, \(\varvec{\xi }(\varvec{x},u)\) and \(\phi (\varvec{x},u)\) of the one-parameter group are defined as

$$\begin{aligned} \varvec{\xi }(\varvec{x},u) := \lim _{\epsilon \rightarrow 0} \frac{\,\mathrm {d}\Xi (\varvec{x}, u;~\epsilon )}{\,\mathrm {d}\epsilon } \end{aligned}$$
(3.2)
$$\begin{aligned} \phi (\varvec{x},u) := \lim _{\epsilon \rightarrow 0}\frac{\,\mathrm {d}\Phi (\varvec{x}, u;~\epsilon )}{\,\mathrm {d}\epsilon }. \end{aligned}$$
(3.3)

Definition 3.3

(characteristics) We define the characteristics, which are given in terms of the infinitesimals of the group, to be

$$\begin{aligned} Q\left( {\varvec{x}, u, \nabla u}\right) := \phi (\varvec{x}, u) - {\left( {\varvec{\xi }(\varvec{x}, u)}\right) }^{{\varvec{\intercal }}}\nabla u. \end{aligned}$$
(3.4)
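
Definitions 3.2 and 3.3 can be illustrated by differentiating a given one-parameter group numerically at \(\epsilon = 0\). The sketch below uses a hypothetical scaling group in one space dimension, \(\Xi = \mathrm {e}^{\epsilon } x\) and \(\Phi = \mathrm {e}^{2\epsilon } u\), chosen only because its infinitesimals, \(\xi = x\) and \(\phi = 2u\), are easy to verify by hand. It is not claimed to be a symmetry of any particular Lagrangian, and the function names are hypothetical.

```python
import math


def Xi(x, u, eps):
    return math.exp(eps) * x


def Phi(x, u, eps):
    return math.exp(2.0 * eps) * u


def infinitesimals(x, u, h=1e-6):
    """Central-difference approximation of (3.2) and (3.3) at eps = 0."""
    xi = (Xi(x, u, h) - Xi(x, u, -h)) / (2.0 * h)
    phi = (Phi(x, u, h) - Phi(x, u, -h)) / (2.0 * h)
    return xi, phi


def characteristic(x, u, du):
    """One-dimensional form of (3.4): Q = phi - xi * u'."""
    xi, phi = infinitesimals(x, u)
    return phi - xi * du


xi0, phi0 = infinitesimals(1.5, 0.8)
print(xi0, phi0, characteristic(1.5, 0.8, 0.25))
```

For this group the exact characteristic is \(Q = 2u - x u'\), against which the numerical value can be compared.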

Definition 3.4

(variational symmetry) Let \(\Gamma := \left\{ {\left( {\varvec{x}, u(\varvec{x})}\right) : \varvec{x} \in \Upsilon }\right\} \) be the graph of u over a subdomain such that \(\overline{\Upsilon }\subset \Omega \). Also let \(\Upsilon _{\Xi } = \Xi \left( {\Gamma ; \epsilon }\right) \), then the transformation (3.1) is said to be a variational symmetry if

$$\begin{aligned} \int _{\Upsilon } L\left( {\varvec{x}, u, \nabla u}\right) \,\mathrm {d}\varvec{x} = \int _{\Upsilon _\Xi } L\left( {\widetilde{\varvec{x}}, \widetilde{u}, \widetilde{\nabla u}}\right) \,\mathrm {d}\widetilde{\varvec{x}} \end{aligned}$$
(3.5)

holds for any smooth subdomain \(\Upsilon \) of \(\Omega \).

Theorem 3.5

(infinitesimal invariance [5, Thm4.12]) A variational symmetry group with infinitesimals \(\varvec{\xi }, \phi \) and characteristics Q of the action functional

$$\begin{aligned} \mathscr {J} [u] = \int _\Omega L(\varvec{x}, u, \nabla u) \,\mathrm {d}\varvec{x} \end{aligned}$$
(3.6)

satisfies

$$\begin{aligned} 0 = {\left( {\mathrm {D}Q}\right) }^{{\varvec{\intercal }}}\partial _3 L+ Q\partial _2 L+ {\text {Div}}\left( {L \varvec{\xi }}\right) . \end{aligned}$$
(3.7)

Proof

See [5, Thm4.12]. \(\square \)

Theorem 3.6

(Noether’s first Theorem [5, Thm4.29]) Suppose the variational problem (2.7) is invariant under the action of a one-parameter group of symmetries with characteristics Q. Then, Q is also a characteristic of a conservation law of the Euler–Lagrange equation (2.16), that is, there exists a vector-valued functional \(\mathscr {C} = \mathscr {C} [u]\) such that

$$\begin{aligned} {\text {Div}}\left( {\mathscr {C} [u]}\right) = Q \left( {\mathscr {E} \mathscr {L}}\right) [u]. \end{aligned}$$
(3.8)

Hence, over solutions of the Euler–Lagrange equations \(\left( {\mathscr {E} \mathscr {L}}\right) [u] = 0\), we have that

$$\begin{aligned} {\text {Div}}\left( {\mathscr {C} [u]}\right) = 0. \end{aligned}$$
(3.9)

For the problem we consider in this work, that of a first-order Lagrangian, the conservation law, \(\mathscr {C} \), takes the form

$$\begin{aligned} \mathscr {C} [u] = - \left( { L \varvec{\xi } + \left( {\phi - {\varvec{\xi }}^{{\varvec{\intercal }}}\nabla u}\right) {\partial _3 L}}\right) . \end{aligned}$$
(3.10)

Proof

Using the result of Theorem 3.5, we have that

$$\begin{aligned} 0 = {\left( {\mathrm {D}Q}\right) }^{{\varvec{\intercal }}}\partial _3 L+Q\partial _2 L + {\text {Div}}\left( {L\varvec{\xi }}\right) . \end{aligned}$$
(3.11)

Noting by the product rule that

$$\begin{aligned} {\left( {\mathrm {D}Q}\right) }^{{\varvec{\intercal }}} \partial _3 L = -Q {\text {Div}}{\partial _3 L} + {\text {Div}}\left( {Q \partial _3 L}\right) , \end{aligned}$$
(3.12)

then it holds that

$$\begin{aligned} 0&= - Q {\text {Div}}\left( {\partial _3 L}\right) + {\text {Div}}\left( {Q \partial _3 L}\right) + Q \partial _2 L+{\text {Div}}\left( {L \varvec{\xi }}\right) \nonumber \\&=Q \left( {-{\text {Div}}\left( {\partial _3 L}\right) + \partial _2 L}\right) + {\text {Div}}\left( {Q\partial _3 L + L \varvec{\xi }}\right) . \end{aligned}$$
(3.13)

This concludes the proof with

$$\begin{aligned} \mathscr {C} [u] = -\left( {Q \partial _3 L + L \varvec{\xi }}\right) , \end{aligned}$$
(3.14)

as required. \(\square \)

Remark 3.7

(the form of \(\mathscr {C} \) ) It is clear from the identity (3.8) that for our model problem, that of minimising a first-order variational problem (2.7), we have \(\mathscr {C} = \mathscr {C} (u, \nabla u)\).

Remark 3.8

(the beauty of the theorem) What makes Theorem 3.6 truly remarkable is its constructive nature. For completeness, we will give an example of the construction of \(\mathscr {C} \) for the Laplacian.

Example 3.9

(Laplace’s problem) Let us consider the case \(f = f(\left| \varvec{x}\right| )\); then, the Lagrangian,

$$\begin{aligned} L\left( {\varvec{x}, u, \nabla u}\right) := \frac{1}{2} \left| \nabla u\right| ^2 - f u, \end{aligned}$$
(3.15)

is invariant under the rotational group SO(d). For simplicity, we restrict to the case \(d=2\), set \(\varvec{x} = {\left( {x, y}\right) }^{{\varvec{\intercal }}}\); then, we calculate the infinitesimals from the group of rotations; note that, in this case, \(\Phi \equiv 0\) and

$$\begin{aligned} \varvec{\Xi }(\varvec{x}, u;~\epsilon ) = \begin{bmatrix} x\cos \epsilon - y\sin \epsilon \\ x\sin \epsilon + y\cos \epsilon \end{bmatrix}. \end{aligned}$$
(3.16)

It then holds that

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\frac{\,\mathrm {d}\varvec{\Xi }(\varvec{x};\epsilon )}{\,\mathrm {d}\epsilon } = \begin{bmatrix} -y \\ x \end{bmatrix}. \end{aligned}$$
(3.17)

In this case, the characteristic of the group of rotations is

$$\begin{aligned} Q(\varvec{x}, u, \nabla u) = y \partial _x u - x \partial _y u. \end{aligned}$$
(3.18)

Making use of Theorem 3.6, we see

$$\begin{aligned} {\text {Div}}\left( {\mathscr {C} [u]}\right) = {\text {Div}}\begin{bmatrix} y \left( {\left( {\partial _y u}\right) ^2 - \left( {\partial _x u}\right) ^2}\right) /2 + x \partial _x u \partial _y u - y f u \\ x \left( {\left( {\partial _y u}\right) ^2 - \left( {\partial _x u}\right) ^2}\right) /2 - y \partial _x u \partial _y u + x f u \end{bmatrix} \end{aligned}$$
(3.19)

is a conservation law over solutions of \(\left( {\mathscr {E} \mathscr {L}}\right) [u] = 0\).
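
The identity (3.8) can be checked numerically for this example. The sketch below builds \(\mathscr {C} [u]\) directly from (3.10) with \(\varvec{\xi } = {\left( {-y, x}\right) }^{{\varvec{\intercal }}}\), \(\phi = 0\) and \(\partial _3 L = \nabla u\), approximates its divergence by central differences, and compares the result with \(Q \left( {\mathscr {E} \mathscr {L}}\right) [u]\). The test function \(u = x^2 y + y^3\), the radial source \(f = x^2 + y^2\) and the step size are hypothetical choices; \(u\) is deliberately not a solution of the Euler–Lagrange equations, so that both sides are non-zero.

```python
def u(x, y):
    return x * x * y + y ** 3


def ux(x, y):
    return 2.0 * x * y


def uy(x, y):
    return x * x + 3.0 * y * y


def laplace_u(x, y):
    return 8.0 * y  # u_xx + u_yy = 2y + 6y


def f(x, y):
    return x * x + y * y  # radial: depends only on |x|


def lagrangian(x, y):
    """L from (3.15) evaluated along u."""
    return 0.5 * (ux(x, y) ** 2 + uy(x, y) ** 2) - f(x, y) * u(x, y)


def Q(x, y):
    """Characteristic (3.18) of the rotation group."""
    return y * ux(x, y) - x * uy(x, y)


def conserved_vector(x, y):
    """C[u] from (3.10) with xi = (-y, x) and d3 L = grad u."""
    L, q = lagrangian(x, y), Q(x, y)
    return (-(L * (-y) + q * ux(x, y)), -(L * x + q * uy(x, y)))


def divergence_of_C(x, y, h=1e-5):
    """Central-difference divergence of (x, y) -> C[u](x, y)."""
    d1 = (conserved_vector(x + h, y)[0] - conserved_vector(x - h, y)[0]) / (2.0 * h)
    d2 = (conserved_vector(x, y + h)[1] - conserved_vector(x, y - h)[1]) / (2.0 * h)
    return d1 + d2


def euler_lagrange(x, y):
    """(2.16) for this Lagrangian: -Laplace(u) - f."""
    return -laplace_u(x, y) - f(x, y)


x0, y0 = 0.6, -0.4
print(divergence_of_C(x0, y0), Q(x0, y0) * euler_lagrange(x0, y0))
```

The two printed values should agree up to the finite difference error, which is consistent with (3.8) holding for any sufficiently smooth \(u\) and not only for solutions.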

Remark 3.10

(trivial Lie group actions) For any variational problem, the Euler–Lagrange equations (2.16), as already mentioned, are given in variational (or divergence) form. As such, if we assume that L does not depend on u, that is, \(L = L(\varvec{x}, \nabla u)\), then the Euler–Lagrange equations themselves are a Noether conservation law. Indeed, consider the case of Example 3.9 with \(f\equiv 0\). It is clear by definition that \(\Delta u = {\text {div}}\left( {\nabla u}\right) = 0\) is a conservation law. It arises from Noether’s Theorem under the trivial Lie group action, namely that of translation in the dependent variable,

$$\begin{aligned} \left( {\varvec{x}, u}\right) \rightarrow \left( {\varvec{x}, u + \epsilon }\right) . \end{aligned}$$
(3.20)

For this action, the infinitesimals are \(\varvec{\xi } = \varvec{0}\) and \(\phi = 1\).
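
As a short check of this remark (a sketch using only (3.4), (3.10) and (2.16)), substituting \(\varvec{\xi } = \varvec{0}\) and \(\phi = 1\) gives

$$\begin{aligned} Q = 1, \qquad \mathscr {C} [u] = -\partial _3 L, \qquad {\text {Div}}\left( {\mathscr {C} [u]}\right) = -{\text {Div}}\left( {\partial _3 L}\right) , \end{aligned}$$

which coincides with \(Q \left( {\mathscr {E} \mathscr {L}}\right) [u]\) precisely because \(\partial _2 L = 0\). For \(L = \frac{1}{2}\left| \nabla u\right| ^2\) this reads \(\mathscr {C} [u] = -\nabla u\) and \({\text {Div}}\left( {\mathscr {C} [u]}\right) = -\Delta u\).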

4 Noether’s Theorem for Weak Solutions

Noether’s Theorem (Theorem 3.6) as it is stated in Sect. 3 only makes sense for classical solutions of the Euler–Lagrange equations (2.16). We wish to “weaken” the theorem such that it is applicable to extremals which are Lipschitz continuous, the so-called broken extremals [24]. We begin by defining a weaker invariance condition than that of Definition 3.4.

Definition 4.1

(weak variational symmetry) The transformation (3.1) is said to be a weak variational symmetry if

$$\begin{aligned} \int _{\Omega } L\left( {\varvec{x}, u, \nabla u}\right) \,\mathrm {d}\varvec{x} = \int _{\Omega _\Xi } L\left( {\widetilde{\varvec{x}}, \widetilde{u}, \widetilde{\nabla u}}\right) \,\mathrm {d}\widetilde{\varvec{x}} \end{aligned}$$
(4.1)

holds over the domain \(\Omega \).

Remark 4.2

(strong symmetry \(\,\Rightarrow \, \) weak symmetry) We note that any strong variational symmetry is also a weak symmetry, but the converse is not true.

Theorem 4.3

(Noether-type conserved quantities for weak variational symmetries) Suppose that the variational problem (2.7) has a weak variational symmetry. Let \(\phi \) and \(\varvec{\xi }\) be the infinitesimal generators of the symmetry as in Definition 3.2. Then,

$$\begin{aligned} 0&= \int _\Omega { \left( { \partial _3 L }\right) }^{{\varvec{\intercal }}} \mathrm {D}\phi + \partial _2 L \phi + L {\text {Div}}\left( {\varvec{\xi }}\right) + {\left( {\partial _1 L}\right) }^{{\varvec{\intercal }}}\varvec{\xi } - {\left( { \partial _3 L}\right) }^{{\varvec{\intercal }}}\nabla u {\text {Div}}\left( {\varvec{\xi }}\right) \,\mathrm {d}\varvec{x}. \end{aligned}$$
(4.2)

Over smooth minimisers, i.e., if \(u\in {\text {C}} ^{2}(\Omega )\), we have

$$\begin{aligned} 0&= \int _\Omega {\text {Div}}\left( {L \varvec{\xi }}\right) - {\text {Div}}\left( {\partial _3 L}\right) Q + \partial _2 L Q \,\mathrm {d}\varvec{x} + \int _{\partial \Omega } { \left( {Q \partial _3L}\right) }^{{\varvec{\intercal }}}\varvec{n} \,\mathrm {d}s\nonumber \\&= \int _{\partial \Omega } { \left( { Q\partial _3 L + \varvec{\xi } L}\right) }^{{\varvec{\intercal }}}\varvec{n} \,\mathrm {d}s. \end{aligned}$$
(4.3)

Remark 4.4

(structure of (4.2)) The weak conservation law given in Theorem 4.3 has a very clear structure. The first two terms in (4.2) represent the weak Euler–Lagrange equations. The last three terms in (4.2) represent the weak conservation law itself.
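
As an illustration of (4.2) (a sketch under the following assumptions: \(d = 2\), \(\Omega \) a disc centred at the origin, the Lagrangian (3.15) with \(f = f(\left| \varvec{x}\right| )\) differentiable away from the origin, and the rotation group of Example 3.9), we have \(\phi = 0\) and \(\varvec{\xi } = {\left( {-y, x}\right) }^{{\varvec{\intercal }}}\), so that

$$\begin{aligned} \mathrm {D}\phi = \varvec{0}, \qquad {\text {Div}}\left( {\varvec{\xi }}\right) = 0, \qquad {\left( {\partial _1 L}\right) }^{{\varvec{\intercal }}}\varvec{\xi } = -u \frac{f'(\left| \varvec{x}\right| )}{\left| \varvec{x}\right| } \left( {-xy + yx}\right) = 0. \end{aligned}$$

Each term of the integrand in (4.2) therefore vanishes pointwise, and in this case the identity holds for every \(u\) for which the integrand is defined, not only for minimisers.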

Proof of Theorem 4.3

Using the fact that the problem (2.7) has a weak variational symmetry, from Definition 4.1 we see that

$$\begin{aligned} 0&= \lim _{\epsilon \rightarrow 0} \frac{1}{\epsilon } \left( { \int _{\Omega _\Xi } L\left( {\widetilde{\varvec{x}}, \widetilde{u}, \widetilde{\nabla u}}\right) \,\mathrm {d}\widetilde{\varvec{x}} - \int _{\Omega } L\left( {\varvec{x}, u, \nabla u}\right) \,\mathrm {d}\varvec{x} }\right) \nonumber \\&= \lim _{\epsilon \rightarrow 0} \frac{1}{\epsilon } \left( { \int _{\Omega } L\left( {\widetilde{\varvec{x}}, \widetilde{u}, \widetilde{\nabla u}}\right) \frac{\,\mathrm {d}\widetilde{\varvec{x}}}{\,\mathrm {d}\varvec{x}} \,\mathrm {d}\varvec{x} - \int _{\Omega } L\left( {\varvec{x}, u, \nabla u}\right) \,\mathrm {d}\varvec{x} }\right) , \end{aligned}$$
(4.4)

noting that \(\widetilde{\varvec{x}}=\widetilde{\varvec{x}}(\varvec{x},u(\varvec{x}),\epsilon )\), so that the first integrand is indeed defined on \(\Omega \). Making use of Definition 3.2 and the fact that

$$\begin{aligned} \widetilde{\nabla u} = \nabla u + \epsilon \left( {\mathrm {D}\phi - \nabla u {\text {Div}}{\varvec{\xi }}}\right) + {\text {O}}(\epsilon ^2), \end{aligned}$$
(4.5)

it holds that

$$\begin{aligned} 0 = \int _\Omega {\left( {\partial _1 L}\right) }^{{\varvec{\intercal }}}\varvec{\xi } + \partial _2 L \phi + {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \mathrm {D}\phi - {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \nabla u {\text {Div}}\left( {\varvec{\xi }}\right) + L{\text {Div}}\left( {\varvec{\xi }}\right) \,\mathrm {d}\varvec{x}, \end{aligned}$$
(4.6)

as required for the first equality. The second arises from noting (4.6) implies

$$\begin{aligned} 0&= \int _\Omega {\left( {\mathrm {D}L - \nabla u\partial _2 L - \mathrm {D}^2u\partial _3 L}\right) }^{{\varvec{\intercal }}}\varvec{\xi } + \partial _2 L \phi + {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \mathrm {D}\phi \nonumber \\&\qquad - {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \nabla u {\text {Div}}\left( {\varvec{\xi }}\right) + L{\text {Div}}\left( {\varvec{\xi }}\right) \,\mathrm {d}\varvec{x}\nonumber \\&= \int _\Omega {\text {Div}}\left( {L\varvec{\xi }}\right) + \partial _{2}{L} \left( {\phi - {\left( {\nabla u}\right) }^{{\varvec{\intercal }}}\varvec{\xi }}\right) + {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \left( {\mathrm {D}\phi - \nabla u {\text {Div}}{\varvec{\xi }} - \mathrm {D}^2u \varvec{\xi }}\right) \,\mathrm {d}\varvec{x}. \end{aligned}$$
(4.7)

Using the fact that

$$\begin{aligned} \mathrm {D}Q = \mathrm {D}\phi - \nabla u {\text {Div}}\left( {\varvec{\xi }}\right) - \left( {\mathrm {D}^2u}\right) \varvec{\xi }, \end{aligned}$$
(4.8)

we have that

$$\begin{aligned} 0&= \int _\Omega {\text {Div}}\left( {L\varvec{\xi }}\right) + Q \partial _2 L - Q {\text {Div}}\left( {\partial _3 L}\right) \,\mathrm {d}\varvec{x} + \int _{\partial {\Omega }} Q {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}}\varvec{n} \,\mathrm {d}s \end{aligned}$$
(4.9)

upon applying Stokes’ Theorem and noting that u is now an extremal and hence satisfies the Euler–Lagrange equations (modulo natural boundary conditions).

Note that if the group action is separable, we may separate the proof into computing the inner and outer variations with respect to the infinitesimals of the one-parameter group (see Definition 3.2) [24, c.f.], where the inner variations are with respect to the independent variables and the outer variation with respect to the dependent variables. \(\square \)

Corollary 4.5

(strong conservation law \(\,\Rightarrow \, \) weak conservation law) Let the variational problem (2.7) have a variational symmetry in the sense of Definition 3.4 and suppose that the minimiser to the variational problem is smooth, i.e., \(u\in {\text {C}} ^{2}(\Omega )\). Then (4.2) holds.

Now, we have developed the framework sufficiently to state our main result in this section. Here, we are concerned with broken extremals, that is, functions whose derivatives have finitely many jump discontinuities.

Definition 4.6

(broken extremal) An extremal, \(u\in {\text {Lip}}(\Omega )\), the space of Lipschitz continuous functions over \(\Omega \), to the problem (2.7) is said to be a broken extremal if it is piecewise \({\text {C}} ^{2}(\Omega )\) over the domain \(\Omega \) with bounded derivative. That is, \(\Omega \) can be decomposed into finitely many open subsets, \(\{\Omega _i\}_{i=1}^N\), such that

  1. the subsets make up the entire domain, i.e., \(\Omega = \bigcup _{i}\overline{\Omega _i}\),

  2. they are non-overlapping, i.e., \(\Omega _i\cap \Omega _j = \emptyset \) for \(i \ne j\), and

  3. the solution is smooth over each of the subsets, i.e., \(u\in {\text {C}} ^{2}(\Omega _i)\cap {\text {Lip}}(\Omega )\).

Definition 4.7

(skeleton and jumps) We define \(\mathscr {F} \) to be the skeleton of the decomposition, that is,

$$\begin{aligned} \mathscr {F}:= \left\{ \varvec{x}:\;\varvec{x} \in \partial \Omega _i, i=1,\ldots , N \right\} . \end{aligned}$$
(4.10)

We will assume that the domain is decomposed in such a way that the skeleton is Lipschitz continuous. Let \(\varvec{n}_i\) be the outward pointing normal to \(\Omega _i\). Across a face shared by two subdomains, labelled \(\Omega _1\) and \(\Omega _2\), we define jumps of scalar- and vector-valued functions as

$$\begin{aligned}&\llbracket v \rrbracket := v|_{\Omega _1} \varvec{n}_1 + v|_{\Omega _2} \varvec{n}_2 \end{aligned}$$
(4.11)
$$\begin{aligned}&\llbracket \varvec{v} \rrbracket := {\left( {\varvec{v}|_{\Omega _1}}\right) }^{{\varvec{\intercal }}} \varvec{n}_1 + {\left( {\varvec{v}|_{\Omega _2}}\right) }^{{\varvec{\intercal }}} \varvec{n}_2, \end{aligned}$$
(4.12)

respectively.
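
The jump operators (4.11) and (4.12) can be sketched in a few lines. The example below assumes a single planar interface shared by two subdomains with opposite unit normals; the numerical traces and the function names are hypothetical. Note that the jump of a scalar is a vector, while the jump of a vector is a scalar.

```python
def scalar_jump(v1, n1, v2, n2):
    """Jump (4.11) of a scalar: v|_1 n_1 + v|_2 n_2, which is a vector."""
    return tuple(v1 * a + v2 * b for a, b in zip(n1, n2))


def vector_jump(w1, n1, w2, n2):
    """Jump (4.12) of a vector: w|_1 . n_1 + w|_2 . n_2, which is a scalar."""
    return sum(a * b for a, b in zip(w1, n1)) + sum(a * b for a, b in zip(w2, n2))


# Interface {x = 0} between Omega_1 (left) and Omega_2 (right).
n1 = (1.0, 0.0)   # outward normal to Omega_1
n2 = (-1.0, 0.0)  # outward normal to Omega_2

continuous = scalar_jump(2.0, n1, 2.0, n2)     # equal traces: zero jump
discontinuous = scalar_jump(3.0, n1, 1.0, n2)  # traces differ by 2 across the interface

# Equal normal components, different tangential components.
flux = vector_jump((1.0, 5.0), n1, (1.0, -2.0), n2)

print(continuous, discontinuous, flux)
```

The last call shows that the vector jump only detects a discontinuity in the normal component: the tangential components differ, yet the jump is zero.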

Definition 4.8

(piecewise variational symmetry) Let u be a broken extremal to the variational problem (2.7), and let \(\left\{ {\Omega _i}\right\} _{i=1}^N\) be the decomposed domain of u. Then, the transformation (3.1) is a piecewise variational symmetry if it is a variational symmetry over \(\Omega _i\) for each \(i=1,\ldots , N\).

Proposition 4.9

(a piecewise Noether result) Suppose the variational problem (2.7) has a piecewise variational symmetry. Then,

$$\begin{aligned} {\text {Div}}\left( {\mathscr {C} [u]}\right) = Q \left( {\mathscr {E} \mathscr {L}}\right) [u] \qquad \text { in } \bigcup _{i} \Omega _i. \end{aligned}$$
(4.13)

Hence, over solutions of the Euler–Lagrange equations \(\left( {\mathscr {E} \mathscr {L}}\right) [u] = 0\), we have that

$$\begin{aligned} {\text {Div}}\left( {\mathscr {C} [u]}\right) = 0 \qquad \text { in } \bigcup _{i} \Omega _i. \end{aligned}$$
(4.14)

Proof

The proof consists of applying Theorem 3.6 on each subdomain \(\Omega _i\). \(\square \)

Definition 4.10

(piecewise weak variational symmetry) Let u be a broken extremal to the variational problem (2.7), and let \(\left\{ {\Omega _i}\right\} _{i=1}^N\) be the decomposed domain of u. Then, the transformation (3.1) is a piecewise weak variational symmetry if it is a weak variational symmetry over \(\Omega _i\) for each \(i=1,\ldots , N\).

Theorem 4.11

(conserved quantities for broken extremals of the variational problem) Suppose the variational problem (2.7) has a piecewise weak variational symmetry. Then,

$$\begin{aligned} 0&= \sum _i \int _{\Omega _i} \left( {-{\text {Div}}\left( {\partial _{3}{L}}\right) + \partial _{2}{L} }\right) \phi - L \partial _{1}{\varvec{\xi }} + {\text {Div}}\left( {\varvec{\xi }}\right) \left( {L - {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \nabla u}\right) \,\mathrm {d}\varvec{x}\nonumber \\&\qquad + \int _{\mathscr {F}} \llbracket L {\varvec{\xi }} + {{\partial _{3}{L}}} \phi \rrbracket \,\mathrm {d}s. \end{aligned}$$
(4.15)

Over broken extremals, we have that

$$\begin{aligned} 0&= \sum _{i=1}^N\int _{\Omega _i} L {\text {Div}}{\varvec{\xi }} - L \partial _{1}{\varvec{\xi }} - {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \nabla u {\text {Div}}{\varvec{\xi }} {\,\mathrm {d}\varvec{x}}\nonumber \\&\qquad + \int _{\mathscr {F}} \llbracket \partial _{3}{L} \phi + L\varvec{\xi } \rrbracket \,\mathrm {d}s. \end{aligned}$$
(4.16)

Proof

The proof of this result follows the same lines as the proof of Theorem 4.3. Using (4.6), we have that for each \(\Omega _i\)

$$\begin{aligned} 0 = \int _{\Omega _i} {\partial _1 L}^{{\varvec{\intercal }}}{\varvec{\xi }} + \partial _2 L \phi + {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \mathrm {D}\phi - {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \nabla u {\text {Div}}\left( {\varvec{\xi }}\right) + L{\text {Div}}\left( {\varvec{\xi }}\right) \,\mathrm {d}\varvec{x}. \end{aligned}$$
(4.17)

Integrating by parts, we see

$$\begin{aligned} 0&= \int _{\Omega _i} \left( {-{\text {Div}}\left( {\partial _{3}{L}}\right) + \partial _{2}{L} }\right) \phi - L \partial _{1}{\varvec{\xi }} - {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \nabla u {\text {Div}}\left( {\varvec{\xi }}\right) + L{\text {Div}}\left( {\varvec{\xi }}\right) \,\mathrm {d}\varvec{x}\nonumber \\&\qquad + \int _{\partial _{}{\Omega } _i} L {\varvec{\xi }}^{{\varvec{\intercal }}} \varvec{n} + {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \varvec{n} \phi \,\mathrm {d}s. \end{aligned}$$
(4.18)

Summing over each of the subdomains, we have

$$\begin{aligned} 0&= \sum _i \int _{\Omega _i} \left( {-{\text {Div}}\left( {\partial _{3}{L}}\right) + \partial _{2}{L} }\right) \phi - L \partial _{1}{\varvec{\xi }} - {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \nabla u {\text {Div}}\left( {\varvec{\xi }}\right) + L{\text {Div}}\left( {\varvec{\xi }}\right) \,\mathrm {d}\varvec{x}\nonumber \\&\qquad + \int _{\mathscr {F}} \llbracket L {\varvec{\xi }} + {\left( {\partial _{3}{L}}\right) } \phi \rrbracket \,\mathrm {d}s, \end{aligned}$$
(4.19)

as required for the first equality. For the second, we note that over broken extremals the Euler–Lagrange expression \(-{\text {Div}}\left( {\partial _{3}{L}}\right) + \partial _{2}{L}\) vanishes over each \(\Omega _i\), concluding the proof. \(\square \)

5 Finite Element Conservation Laws

5.1 Discretisation

In this section, we calculate the discrete counterpart to Theorem 4.11 in the finite element context. To that end, let \(\mathscr {T} ^{}\) be a conforming triangulation of \(\Omega \), namely \(\mathscr {T} ^{}\) is a finite family of sets such that

  1. \(K\in \mathscr {T} ^{}\) implies K is an open simplex (segment for \(d=1\), triangle for \(d=2\), tetrahedron for \(d=3\)),

  2. for any \(K,J\in \mathscr {T} ^{}\), we have that \(\overline{K}\cap \overline{J}\) is a full subsimplex (i.e. it is either \(\emptyset \), a vertex, an edge, a face, or the whole of \(\overline{K}\) and \(\overline{J}\)) of both \(\overline{K}\) and \(\overline{J}\), and

  3. \(\bigcup _{K\in \mathscr {T} ^{}}\overline{K}=\overline{\Omega }\).

We let \(\mathscr {E} {}\) be the skeleton (set of internal common interfaces) of the triangulation \(\mathscr {T} ^{}\) and say \(e\in \mathscr {E} \) if e is on the interior of \(\Omega \) and \(e\in \partial \Omega \) if e lies on the boundary \(\partial \Omega \).

The shape regularity of \(\mathscr {T} ^{}\) is defined as

$$\begin{aligned} \mu (\mathscr {T} ^{}) := \inf _{K\in \mathscr {T} ^{}} \frac{\rho _K}{h_K}, \end{aligned}$$
(5.1)

where \(\rho _K\) is the radius of the largest ball contained inside K, and \(h_K\) is the diameter of K. We use the convention where \(h:\Omega \rightarrow \mathbb {R} \) denotes the meshsize function of \(\mathscr {T} ^{}\), i.e.

$$\begin{aligned} h(\varvec{x}):=\max _{\overline{K}\ni \varvec{x}}h_K, \end{aligned}$$
(5.2)

where \(h_K\) is the diameter of an element K. We introduce the finite element spaces

$$\begin{aligned} \mathbb V:= \left\{ \Phi \in {\text {C}} ^{0}(\Omega ) :\;\Phi \vert _{K} \in \mathbb {P} ^{k}\,\forall \,K\in \mathscr {T} ^{} \right\} \end{aligned}$$
(5.3)
(5.4)

where \(\mathbb {P} ^{k}\) denotes the linear space of polynomials in d variables of degree no higher than a positive integer k. We consider \(k\ge 1\) to be fixed.
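As an illustration of the shape regularity (5.1), the following Python sketch computes \(\rho _K/h_K\) for triangles in \(d=2\), using the facts that the largest inscribed ball of a triangle has radius equal to twice the area divided by the perimeter and that the diameter is the longest edge. The function names and the two sample triangles are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def edge_lengths(tri: Triangle) -> List[float]:
    """Lengths of the three edges of a triangle."""
    a, b, c = tri
    return [math.dist(a, b), math.dist(b, c), math.dist(c, a)]


def diameter(tri: Triangle) -> float:
    """h_K: the diameter of a triangle is its longest edge."""
    return max(edge_lengths(tri))


def inradius(tri: Triangle) -> float:
    """rho_K: radius of the largest inscribed ball, 2 * area / perimeter."""
    a, b, c = tri
    area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))
    return 2.0 * area / sum(edge_lengths(tri))


def shape_regularity(mesh: Sequence[Triangle]) -> float:
    """mu(T): the smallest ratio rho_K / h_K over the mesh, as in (5.1)."""
    return min(inradius(K) / diameter(K) for K in mesh)


# A well-shaped right triangle and a thin sliver (illustrative data only).
right = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
sliver = ((0.0, 0.0), (1.0, 0.0), (0.5, 0.01))
mu = shape_regularity([right, sliver])
```

The sliver dominates the infimum, so a single badly shaped element is enough to degrade \(\mu (\mathscr {T} ^{})\).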

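The Galerkin approximation to the variational problem (2.7) is to seek a finite element function \(U\) that minimises the energy functional over the finite element space, that is,

$$\begin{aligned} \mathscr {J} [U] \le \mathscr {J} [V] \quad \text {for every } V \text { in the finite element space.} \end{aligned}$$
(5.5)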

The finite element scheme defined by (5.5) is guaranteed to be well posed under some assumptions on L that allow us to invoke the Lax–Milgram Theorem or the more generally applicable inf-sup condition [25]. Henceforth, we make the following blanket assumption which guarantees the continuous minimisation problem admits a unique solution.

Assumption 5.1

(growth conditions and coercivity of L ) Let \(v\in {\text {W}} ^{1}_{p}(\Omega )\) for \(p\in (1,\infty )\); then, we assume there exist constants \(C_1 > 0\) and \(C_2 \ge 0\) such that

$$\begin{aligned}&\left| L\left( {\varvec{x}, v, \nabla v}\right) \right| \le C_1 \left( {\left| \nabla v\right| ^p + \left| v\right| ^p + 1}\right) \quad \,\forall \,\varvec{x} \in \Omega , \end{aligned}$$
(5.6)
$$\begin{aligned}&\quad {L\left( {\varvec{x}, v, \nabla v}\right) } \ge C_1 \left( {\left| \nabla v\right| ^p - C_2}\right) \end{aligned}$$
(5.7)

and that \(L(\varvec{x}, u, \nabla u)\) is convex in the third variable.

We may now proceed to derive a finite element Noether-type conservation law. As already seen, the conservation law arises after taking inner and outer variations of the variational problem. The discrete outer variation can be characterised by the following Lemma.

Lemma 5.2

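(discrete Euler–Lagrange equations) The discrete Euler–Lagrange equations associated with the variational minimisation problem (5.5) are to seek a finite element function \(U\) such that

$$\begin{aligned} 0 = \int _\Omega \left( {-{\text {Div}}_{\mathscr {T} ^{}}\left( {\partial _{3}{L}}\right) + \partial _{2}{L}}\right) V {\,\mathrm {d}\varvec{x}} + \int _\mathscr {E} \llbracket \partial _{3}{L} \rrbracket V {\,\mathrm {d}\varvec{s}} \quad \text {for every discrete variation } V, \end{aligned}$$
(5.8)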

where \(L=L(\varvec{x}, U, \nabla U)\) and \({\text {Div}}_{\mathscr {T} ^{}}\) denotes an elementwise (\(\mathscr {T} ^{}\)-wise) \({\text {Div}}\) operator.

Proof

Define the real-valued function which we call the outer variation operator

$$\begin{aligned} o\left( {\epsilon }\right) := \int _\Omega L\left( {\varvec{x}, U + \epsilon V, \nabla U + \epsilon \nabla V}\right) {\,\mathrm {d}\varvec{x}}, \end{aligned}$$
(5.9)

where \(V\) is a discrete variation. Since \(U\) is the discrete minimiser of the energy functional, we certainly have that \(o'(0) = 0\), and we may explicitly compute this quantity, the first outer variation,

$$\begin{aligned} o'(\epsilon )&= \int _\Omega {\partial _{3}{L} {\left( {\varvec{x}, U + \epsilon V, \nabla U + \epsilon \nabla V}\right) }}^{{\varvec{\intercal }}} \nabla V\nonumber \\&\quad + \partial _{2}{L} \left( {\varvec{x}, U + \epsilon V, \nabla U + \epsilon \nabla V}\right) V {\,\mathrm {d}\varvec{x}}. \end{aligned}$$
(5.10)

Note that since \(\nabla U\) is not continuous over the skeleton of the triangulation \(\mathscr {T} ^{}\), spatial derivatives of \(\partial _{3}{L} \) are, in general, not well defined. However, \(\nabla U\) is smooth over the interior of each element. We thus split the integral into elementwise contributions and integrate by parts elementwise. For brevity, we write \(L = L\left( {\varvec{x}, U, \nabla U}\right) \) and drop the explicit dependency.

$$\begin{aligned} o'(0)&= \sum _{K\in \mathscr {T} ^{}}\int _K {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \nabla V + \left( {\partial _{2}{L}}\right) V {\,\mathrm {d}\varvec{x}}\nonumber \\&= \sum _{K\in \mathscr {T} ^{}} \bigg ( \int _K \left( {-{\text {Div}}_{\mathscr {T} ^{}}\left( {\partial _{3}{L}}\right) + \partial _{2}{L} }\right) V {\,\mathrm {d}\varvec{x}} + \int _{\partial K} {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \varvec{n}_K V {\,\mathrm {d}\varvec{s}} \bigg ). \end{aligned}$$
(5.11)

We now use the identity

$$\begin{aligned} \sum _{K\in \mathscr {T} ^{}} \int _{\partial K} {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \varvec{n}_K V {\,\mathrm {d}\varvec{s}} = \int _\mathscr {E} \llbracket \partial _{3}{L} \rrbracket {V} {\,\mathrm {d}\varvec{s}}, \end{aligned}$$
(5.12)

and hence

$$\begin{aligned} 0 = o'(0)&= \int _\Omega \left( {-{\text {Div}}_{\mathscr {T} ^{}}\left( {\partial _{3}{L}}\right) + \partial _{2}{L}}\right) V {\,\mathrm {d}\varvec{x}} + \int _\mathscr {E} \llbracket \partial _{3}{L} \rrbracket V {\,\mathrm {d}\varvec{s}} \end{aligned}$$
(5.13)

as required. \(\square \)

Example 5.3

(discrete Laplace’s problem) The discrete Euler–Lagrange equations associated with Laplace’s problem (Example 3.9) are to find a finite element function \(U\) such that

(5.14)

where \(\Delta _{\mathscr {T} ^{}}\) is an elementwise Laplacian.

Note that if U is a piecewise linear function, the first term of (5.14) is zero. Hence, in this case, the discrete Laplacian can be completely characterised in terms of the jump of the gradient of U over the internal skeleton.
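The observation about piecewise linear functions can be checked in one space dimension. The sketch below assumes \(d=1\), a non-uniform mesh of an interval, and piecewise linear hat functions \(\varphi _i\); it compares \(\int U' \varphi _i' \,\mathrm {d}x\) with the jump of \(U'\) at the interior node \(x_i\), where the outward normals of the left and right cells are \(+1\) and \(-1\). The function names and the nodal data are hypothetical.

```python
from typing import List, Sequence


def gradient_jumps(nodes: Sequence[float], values: Sequence[float]) -> List[float]:
    """Jump of U' at each interior node: slope_left * (+1) + slope_right * (-1)."""
    slopes = [
        (values[i + 1] - values[i]) / (nodes[i + 1] - nodes[i])
        for i in range(len(nodes) - 1)
    ]
    return [slopes[i - 1] - slopes[i] for i in range(1, len(nodes) - 1)]


def stiffness_action(nodes: Sequence[float], values: Sequence[float]) -> List[float]:
    """Integral of U' * phi_i' over the interval, for each interior hat function phi_i."""
    result = []
    for i in range(1, len(nodes) - 1):
        h_left = nodes[i] - nodes[i - 1]
        h_right = nodes[i + 1] - nodes[i]
        slope_left = (values[i] - values[i - 1]) / h_left
        slope_right = (values[i + 1] - values[i]) / h_right
        # phi_i' equals 1/h_left on the left cell and -1/h_right on the right cell,
        # and all integrands are constant on each cell.
        left_part = slope_left * (1.0 / h_left) * h_left
        right_part = slope_right * (-1.0 / h_right) * h_right
        result.append(left_part + right_part)
    return result


# Non-uniform mesh and arbitrary nodal values (illustrative data only).
nodes = [0.0, 0.3, 0.5, 1.0]
values = [x * x for x in nodes]

jumps = gradient_jumps(nodes, values)
weak_laplacian = stiffness_action(nodes, values)

# Nodal values of a globally linear function give vanishing jumps.
linear_jumps = gradient_jumps(nodes, [2.0 * x + 1.0 for x in nodes])
```

Since the elementwise Laplacian of a piecewise linear function is zero, the two lists agree up to rounding, so in this setting the weak Laplacian tested against \(\varphi _i\) is exactly the gradient jump at \(x_i\).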

Definition 5.4

( \({\text {L}} _{2}(\Omega )\) projection operator) We define \({\text {P}}_{\mathbb V} :{\text {L}} _{2}(\Omega )\rightarrow \mathbb V\) such that for each \(w\in {\text {L}} _{2}(\Omega )\), we have

$$\begin{aligned} \int _\Omega {{\text {P}}_{\mathbb V} w} \ {V} {\,\mathrm {d}\varvec{x}} = \int _\Omega {w}{V} {\,\mathrm {d}\varvec{x}} \quad \,\forall \,V\in \mathbb V. \end{aligned}$$
(5.15)
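A minimal sketch of the projection (5.15), assuming \(d=1\) and continuous piecewise linear elements on a non-uniform mesh. The load integrals are approximated by Simpson's rule on each cell and the tridiagonal mass matrix system is solved with the Thomas algorithm; both choices, as well as the name `l2_projection`, are assumptions made for this illustration.

```python
from typing import Callable, List, Sequence


def l2_projection(nodes: Sequence[float], w: Callable[[float], float]) -> List[float]:
    """Nodal coefficients of the L2 projection of w onto continuous P1 functions."""
    n = len(nodes)
    diag = [0.0] * n          # diagonal of the mass matrix
    off = [0.0] * (n - 1)     # off[i] couples node i with node i + 1
    rhs = [0.0] * n           # load vector, integral of w * phi_i
    for i in range(n - 1):
        a, b = nodes[i], nodes[i + 1]
        h = b - a
        wa, wm, wb = w(a), w(0.5 * (a + b)), w(b)
        # Element mass matrix is (h / 6) * [[2, 1], [1, 2]].
        diag[i] += h / 3.0
        diag[i + 1] += h / 3.0
        off[i] += h / 6.0
        # Simpson's rule; the local basis functions equal 1/2 at the midpoint.
        rhs[i] += h / 6.0 * (wa + 2.0 * wm)
        rhs[i + 1] += h / 6.0 * (2.0 * wm + wb)
    # Thomas algorithm: forward elimination followed by back substitution.
    c = list(diag)
    d = list(rhs)
    for i in range(1, n):
        m = off[i - 1] / c[i - 1]
        c[i] -= m * off[i - 1]
        d[i] -= m * d[i - 1]
    coeffs = [0.0] * n
    coeffs[-1] = d[-1] / c[-1]
    for i in range(n - 2, -1, -1):
        coeffs[i] = (d[i] - off[i] * coeffs[i + 1]) / c[i]
    return coeffs


# A function that already lies in the discrete space is reproduced exactly,
# because Simpson's rule integrates the quadratic products w * phi_i exactly.
mesh = [0.0, 0.25, 0.6, 1.0]
linear = lambda x: 2.0 * x + 1.0
projected = l2_projection(mesh, linear)
```

For a general \(w\) the quadrature introduces an additional error, so a higher order rule may be preferable in practice.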

Theorem 5.5

(conserved quantities over \({\text {C}} ^{0}(\Omega )\)-finite element spaces) Let u be the unique weak extremum of the minimisation problem (2.7) and U be its finite element approximation. Suppose that this problem has a piecewise weak variational symmetry. Then, the finite element solution satisfies

$$\begin{aligned} 0 = \mathscr {N} [U]&:= \int _\Omega \left( {- {\text {Div}}_{\mathscr {T} ^{}}\left( {\partial _{3}{L}}\right) + \partial _{2}{L}}\right) {\text {P}}_{\mathbb V} \phi + {\text {Div}}{\varvec{\xi }} \left( {L - {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \nabla U }\right) - L \partial _{1}{ \varvec{\xi }} {\,\mathrm {d}\varvec{x}}\nonumber \\&\qquad + { \int _{\mathscr {E}} \llbracket {\partial _{3}{L}} {\text {P}}_{\mathbb V} \phi + L {\varvec{\xi }} \rrbracket \,\mathrm {d}\varvec{s}} + \int _{\partial _{}{\Omega }} {\partial _{3}{L}}^{{\varvec{\intercal }}}{\varvec{n}} {\text {P}}_{\mathbb V} \phi \,\mathrm {d}s \end{aligned}$$
(5.16)

Proof

Recall we have an energy functional of the form

$$\begin{aligned} \mathscr {J} [u] := \int _\Omega L\left( {\varvec{x}, u, \nabla u}\right) {\,\mathrm {d}\varvec{x}}. \end{aligned}$$
(5.17)

A finite element minimiser of this energy functional is continuous over the domain \(\Omega \), but its derivative is not. For clarity, we will assume the group actions are separable, that is, the group action acts on the dependent and independent variables separately, i.e., \(\widetilde{x} = \widetilde{x}(x,\epsilon )\), the inner variation, and \(\widetilde{u} = \widetilde{u}(u,\epsilon )\), the outer variation. For non-separable group actions, the result can be verified using techniques from the proof of Theorem 4.3.

Using the argument from the proof of Lemma 5.2 with \({\text {P}}_{\mathbb V} \phi \) as the outer variation, we have

$$\begin{aligned} 0 = \int _\Omega \left( {-{\text {Div}}_{\mathscr {T} ^{}} \left( {\partial _3 L}\right) + \partial _2 L}\right) {\text {P}}_{\mathbb V} \phi \,\mathrm {d}\varvec{x} + \int _\mathscr {E} \llbracket \partial _3 L \rrbracket {\text {P}}_{\mathbb V} \phi \,\mathrm {d}s + \int _{\partial \Omega } {\left( {\partial _3 L}\right) }^{{\varvec{\intercal }}} \varvec{n} {\text {P}}_{\mathbb V} \phi \,\mathrm {d}s, \end{aligned}$$
(5.18)

noting the additional boundary term arising since \({\text {P}}_{\mathbb V} \phi \) is not necessarily compactly supported.

The inner variation can be regarded as a change in variables on the independent variable [24, §3.3]. In a similar calculation to that of the Proof of Theorem 4.3, we let \(i(\epsilon ) = \mathscr {J} [U(\widetilde{\varvec{x}})]\) with \(\widetilde{\varvec{x}} = \varvec{x} + \epsilon \varvec{\xi }\). We again split the integral into subdomains to obtain

$$\begin{aligned} 0&= \lim _{\epsilon \rightarrow 0} \frac{1}{\epsilon }\left( {\mathscr {J} [U(\widetilde{\varvec{x}})] - \mathscr {J} [U(\varvec{x})]}\right) \nonumber \\&= \lim _{\epsilon \rightarrow 0} \frac{1}{\epsilon } \left( {\int _{\Omega _\Xi } L\left( {\widetilde{\varvec{x}}, {U}(\widetilde{\varvec{x}}), \widetilde{\nabla } U\left( {\widetilde{\varvec{x}}}\right) }\right) \,\mathrm {d}\widetilde{\varvec{x}} - \int _\Omega L\left( {{\varvec{x}}, U({\varvec{x}}), \nabla U\left( {{\varvec{x}}}\right) }\right) \,\mathrm {d}{\varvec{x}} }\right) \nonumber \\&= \lim _{\epsilon \rightarrow 0} \frac{1}{\epsilon } \left( { \int _{\Omega } L\left( { \widetilde{\varvec{x}}, {U}(\widetilde{\varvec{x}}), \widetilde{\nabla } U(\widetilde{\varvec{x}}) }\right) \frac{\,\mathrm {d}\widetilde{\varvec{x}}}{\,\mathrm {d}\varvec{x}} \,\mathrm {d}\varvec{x} - \int _{\Omega } L\left( {\varvec{x}, U(\varvec{x}), \nabla U(\varvec{x})}\right) \,\mathrm {d}\varvec{x} }\right) . \end{aligned}$$
(5.19)

We note the coordinate transformation in the first integral from \(\Omega _\Xi \) to \(\Omega \) is nothing other than writing \(\widetilde{\varvec{x}}\) in terms of \(\varvec{x}\) and the group parameter, so that \(\widetilde{\varvec{x}} = \widetilde{\varvec{x}} (\varvec{x}, \epsilon )\).

Computing the quantities elementwise, we have

$$\begin{aligned} 0&= \lim _{\epsilon \rightarrow 0} \frac{1}{\epsilon } \left( { \sum _{K\in \mathscr {T} ^{}} \int _K L\left( { \widetilde{\varvec{x}}, {U}(\widetilde{\varvec{x}}), \widetilde{\nabla } U(\widetilde{\varvec{x}}) }\right) \frac{\,\mathrm {d}\widetilde{\varvec{x}}}{\,\mathrm {d}\varvec{x}} \,\mathrm {d}\varvec{x} - \int _{K} L\left( {\varvec{x}, U(\varvec{x}), \nabla U(\varvec{x})}\right) \,\mathrm {d}\varvec{x} }\right) \nonumber \\&= \sum _{K\in \mathscr {T} ^{}} \bigg ( \int _K - L \partial _{1}{\varvec{\xi }} + {\text {Div}}{\left( {\varvec{\xi }}\right) } \left( {L - {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \nabla U }\right) {\,\mathrm {d}\varvec{x}} + \int _{\partial K} L {\varvec{\xi }}^{{\varvec{\intercal }}} \varvec{n}_K {\,\mathrm {d}\varvec{s}} \bigg )\nonumber \\&= \int _\Omega -{L} \partial _{1}{\varvec{\xi }} + L {\text {Div}}{\varvec{\xi }} - {\left( {\partial _{3}{L}}\right) }^{{\varvec{\intercal }}} \nabla U {\text {Div}}{\varvec{\xi }} {\,\mathrm {d}\varvec{x}} + \int _\mathscr {E} \llbracket L \varvec{\xi } \rrbracket {\,\mathrm {d}\varvec{s}}, \end{aligned}$$
(5.20)

which, added to (5.18), yields (5.16) as required. \(\square \)

5.2 Applications to the p-Laplacian

In this section, we give a numerical verification of Theorem 5.5 for a model test problem, that of the p-Laplacian

$$\begin{aligned} -{\text {div}}\left( {\left| \nabla u\right| ^{p-2}\nabla u}\right) = f, \end{aligned}$$
(5.21)

where we will restrict \(p\in (1,\infty )\). The p-Laplacian is the Euler–Lagrange equation of the following minimisation problem: Find \(u\in {\overset{{\scriptscriptstyle \circ }}{{\text {W}}}}{}^{1}_{p}(\Omega )\) such that

$$\begin{aligned} \mathscr {J} _p[u] \le \mathscr {J} _p[v] \quad \,\forall \,v\in {\overset{{\scriptscriptstyle \circ }}{{\text {W}}}}{}^{1}_{p}(\Omega ), \end{aligned}$$
(5.22)

with the (parameterised) action functional \(\mathscr {J} _p\) given by

$$\begin{aligned} \mathscr {J} _p[v] := \int _\Omega \frac{1}{p}\left| \nabla v\right| ^p - fv {\,\mathrm {d}\varvec{x}}. \end{aligned}$$
(5.23)

Note that for \(p=2\), this problem coincides with the standard Laplace’s problem (see Example 3.9). For general p, it is well known that the problem is uniquely solvable.

The discrete weak formulation associated with the minimisation problem (5.22) is to find \(U\in \mathbb V\) such that

$$\begin{aligned} \int _\Omega \left| \nabla U\right| ^{p-2}{\left( {\nabla U}\right) }^{{\varvec{\intercal }}}\nabla V {\,\mathrm {d}\varvec{x}} = \int _\Omega f \ V {\,\mathrm {d}\varvec{x}} \quad \,\forall \,V\in \mathbb V. \end{aligned}$$
(5.24)

In this test, we choose f such that

$$\begin{aligned} u = \sin {({\pi \left| \varvec{x}\right| ^2})} \end{aligned}$$
(5.25)

solves the p-Laplace equation (5.21). We have that f can be written as \(f = f(\left| \varvec{x}\right| )\), and hence, the Lagrangian

$$\begin{aligned} L(\varvec{x}, v, \nabla v) = \frac{1}{p}\left| \nabla v\right| ^p - fv \end{aligned}$$
(5.26)

is invariant under SO(d) group actions.
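The invariance claim can be spot-checked numerically. The sketch below assumes \(d=2\), a rotation acting as \(\widetilde{\varvec{x}} = R\varvec{x}\) with the gradient transforming as \(R\nabla v\), and a hypothetical radial source \(f(r) = 1 + r^2\) that is not the one used in the experiments; all sample values are illustrative.

```python
import math
from typing import Callable, Tuple

Vec2 = Tuple[float, float]


def rotate(vec: Vec2, theta: float) -> Vec2:
    """Apply the planar rotation by angle theta, an element of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * vec[0] - s * vec[1], s * vec[0] + c * vec[1])


def lagrangian(x: Vec2, v: float, grad: Vec2, p: float,
               f: Callable[[float], float]) -> float:
    """L(x, v, grad v) = |grad v|^p / p - f(|x|) v, as in (5.26) with radial f."""
    return math.hypot(*grad) ** p / p - f(math.hypot(*x)) * v


radial_source = lambda r: 1.0 + r * r   # hypothetical radial source term

x = (0.3, -0.4)
v = 0.7
grad = (1.2, -0.5)
p = 3.0
theta = 0.9

original = lagrangian(x, v, grad, p, radial_source)
rotated = lagrangian(rotate(x, theta), v, rotate(grad, theta), p, radial_source)
difference = abs(original - rotated)
```

Rotations preserve both \(\left| \varvec{x}\right| \) and \(\left| \nabla v\right| \), so `difference` is zero up to rounding for any angle.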

We fix \(d=2\) and take \(\mathscr {T} ^{}\) to be a structured triangulation of \(\Omega \), the unit disc, as given in Fig. 1.

Fig. 1 An example of the triangulation \(\mathscr {T} ^{}\) and the finite element approximation to \(u = \sin {({\pi \left| \varvec{x}\right| ^2})}\), the solution of the p-Laplacian. (a) An example of the triangulation; here \(\dim {\mathbb V} = 12564\). (b) The piecewise linear finite element approximation of the solution of the 3-Laplacian.

It is well known [26, c.f.] that the finite element approximation (5.24) is well posed and has optimal convergence properties with respect to certain quasinorms. In Table 1, we show errors, convergence rates, and the values of the finite element Noether quantity as written in Theorem 5.5 for various cases of p. The tables also study the experimental order of convergence of the numerical approximation which we now define.

Definition 5.6

(experimental order of convergence) Given two sequences a(i) and \(h(i)\searrow 0\), \(i\in \left[ l:\right] \), we define the experimental order of convergence (\({\text {EOC}}\)) to be the local slope of the \(\log a(i)\) versus \(\log h(i)\) curve, i.e.

$$\begin{aligned} {\text {EOC}} (a,h;i):=\frac{ \log (a(i+1)/a(i)) }{ \log (h(i+1)/h(i)) }. \end{aligned}$$
(5.27)
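The definition (5.27) translates directly into a few lines of Python. The sketch below uses synthetic data, an error sequence that behaves exactly like \(3h^2\), and does not reproduce any values from Table 1; the function name `eoc` is hypothetical.

```python
import math
from typing import List, Sequence


def eoc(a: Sequence[float], h: Sequence[float]) -> List[float]:
    """Experimental order of convergence (5.27) between consecutive refinement levels."""
    return [
        math.log(a[i + 1] / a[i]) / math.log(h[i + 1] / h[i])
        for i in range(len(a) - 1)
    ]


# Synthetic mesh sizes and errors with a(i) = 3 * h(i)^2 (illustrative only).
mesh_sizes = [0.5, 0.25, 0.125, 0.0625]
errors = [3.0 * hi ** 2 for hi in mesh_sizes]
orders = eoc(errors, mesh_sizes)
```

For these synthetic data every entry of `orders` is 2 up to rounding, which is the behaviour one looks for when comparing measured errors against a predicted rate.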
Table 1 In this test, we computationally study the behaviour of the finite element conserved quantity, \(\mathscr {N} [U]\), given in Theorem 5.5

Remark 5.7

(numerical conservation) In the numerical experiments shown in Table 1, we formulated (5.24) as a system of nonlinear equations, whose solution is then approximated by a Newton method with tolerance set at \(10^{-10}\). At each Newton step, the solution to the linear system of equations is approximated using a stabilised conjugate gradient iterative solver with an algebraic multigrid preconditioner, also set at a tolerance of \(10^{-10}\). Since the solvers themselves only generate approximations to the solution of the discrete variational problem, conservation holds only up to a certain tolerance. In this case, the quantity is conserved up to the tolerance of the solvers, \(10^{-10}\).

5.3 Mimetic Methods Weakly Enforce Discrete Conservation Laws Which are Derived from Trivial Lie Group Actions

The mimetic finite element framework consists of reformulating the Euler–Lagrange equations as a system of first-order PDEs. Consider our prototypical example for illustrative purposes. Poisson’s problem,

$$\begin{aligned} \Delta u = 0, \end{aligned}$$
(5.28)

is the Euler–Lagrange equation of the minimisation problem

$$\begin{aligned} \mathscr {J} [u] = \int _\Omega \frac{1}{2}\left| \nabla u\right| ^2 {\,\mathrm {d}\varvec{x}} \rightarrow \text { min.} \end{aligned}$$
(5.29)

It can be written in mixed form by introducing an auxiliary variable \(\varvec{p}\) to represent the gradient and rewriting Poisson’s problem to seek \(\left( {u, \varvec{p}}\right) \) such that

$$\begin{aligned} {\text {div}}{\varvec{p}} = 0 \end{aligned}$$
(5.30)
$$\begin{aligned} \varvec{p} = \nabla u. \end{aligned}$$
(5.31)

These are then the Euler–Lagrange equations of the saddle point problem

$$\begin{aligned} \mathscr {K} [u, \varvec{p}] := \int _\Omega \frac{1}{2} \left| \varvec{p}\right| ^2 + u\left( {{\text {div}}{\varvec{p}}}\right) {\,\mathrm {d}\varvec{x}}. \end{aligned}$$
(5.32)
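As a formal check of this claim, take variations \(v\) and \(\varvec{q}\) (notation introduced here for illustration only) that are smooth enough to integrate by parts, and assume that the boundary term vanishes, for instance because \(u = 0\) on \(\partial \Omega \). The first variations of \(\mathscr {K} \) can then be sketched as

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}\epsilon } \mathscr {K} [u + \epsilon v, \varvec{p}] \Big \vert _{\epsilon = 0}&= \int _\Omega v \, {\text {div}}{\varvec{p}} {\,\mathrm {d}\varvec{x}}, \nonumber \\ \frac{\mathrm {d}}{\mathrm {d}\epsilon } \mathscr {K} [u, \varvec{p} + \epsilon \varvec{q}] \Big \vert _{\epsilon = 0}&= \int _\Omega \varvec{p} \cdot \varvec{q} + u \, {\text {div}}{\varvec{q}} {\,\mathrm {d}\varvec{x}} = \int _\Omega \left( {\varvec{p} - \nabla u}\right) \cdot \varvec{q} {\,\mathrm {d}\varvec{x}} + \int _{\partial \Omega } u \, \varvec{q} \cdot \varvec{n} \,\mathrm {d}s. \end{aligned}$$

Requiring the first expression to vanish for all \(v\) recovers (5.30), and requiring the second to vanish for all \(\varvec{q}\), with the boundary integral dropped under the stated assumption, recovers (5.31).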

The correct function space setting is to seek \(u\in {\text {L}} _{2}(\Omega )\) and \(\varvec{p} \in {\text {H}} ^{\text {div}}(\Omega ) := \left\{ \varvec{\Psi } :\;{\text {div}}{\varvec{\Psi }} \in {\text {L}} _{2}(\Omega ) \right\} \). A conformal approximation of this problem can be sought using the Raviart–Thomas and piecewise constant finite element pair [19], for example. A sufficient condition for the construction of a conformal finite element space of \({\text {H}} ^{\text {div}}(\Omega )\) is that the jumps of the discrete functions vanish over the skeleton of the domain [27, c.f.]. Recall Remark 3.10 concerned itself with the trivial Lie group action of translation in the dependent variable. For our model problem, we have that \(\nabla u\) is conserved. The mimetic scheme weakly enforces this conservation law.

6 Conservative Properties of Lagrangian FEs for Strong Solutions

In this section, we present results concerning the approximability of the strong continuous conservation laws arising from Theorem 3.6. We examine numerically the behaviour of the Lagrangian finite element method. In this sense, we wish to measure the quantity \({\text {Div}}\left( {\mathscr {C} [U]}\right) \) and evaluate how far it deviates from zero. For clarity of exposition, we will assume henceforth that the continuous minimisation problem takes the form: Find \(u\in {{\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )}\) such that

$$\begin{aligned} \mathscr {J} [u] = \inf _{v\in {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )} \mathscr {J} [v]. \end{aligned}$$
(6.1)

Theorem 6.1

(Bound on the finite element approximation of Noether’s laws.) Let \(u\in {\text {H}} ^{2}(\Omega )\cap {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )\) be a strong extremum of the variational problem (2.7) (and hence a strong solution to the Euler–Lagrange equations (2.16)). Suppose we have that Theorem 3.6 holds under a variational symmetry group with infinitesimals \(\varvec{\xi }\) and \(\phi \).

In addition, assume that we have

$$\begin{aligned} L(\varvec{x}, v, \nabla v) \in {\text {L}} _{\infty }(\Omega ) \quad \,\forall \,v\in {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega ) \end{aligned}$$
(6.2)
$$\begin{aligned} \partial _{3}{L} (\varvec{x},v,\nabla v) \in \left[ {{\text {L}} _{\infty }(\Omega )}\right] ^d \quad \,\forall \,v\in {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega ). \end{aligned}$$
(6.3)

Then, if \(U\in \mathbb V\) is the finite element approximation to u, there exists a constant C such that

$$\begin{aligned} \left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{{\text {H}} ^{-1}(\Omega )}&\le C \Bigg ( \left\| L(\varvec{x}, U, \nabla U) - L(\varvec{x}, u, \nabla u) \right\| _{{\text {L}} _{2}(\Omega )} + \left\| \nabla U - \nabla u \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\qquad + \left\| \partial _{3}{L} (\varvec{x},U,\nabla U) - \partial _{3}{L} (\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\qquad + \left\| \phi (\varvec{x}, U) - \phi (\varvec{x}, u)\right\| _{{\text {L}} _{2}(\Omega )} + \left\| \varvec{\xi }(\varvec{x}, U) - \varvec{\xi }(\varvec{x}, u)\right\| _{{\text {L}} _{2}(\Omega )} \Bigg ). \end{aligned}$$
(6.4)

Proof

We begin by noting that since u is a strong extremal, Theorem 3.6 holds, and we have that \({\text {Div}}\left( {\mathscr {C} [u]}\right) = 0\). Hence,

$$\begin{aligned} \left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{{\text {H}} ^{-1}(\Omega )} = \left\| {\text {Div}}\left( {\mathscr {C} [U]}\right) - {\text {Div}}\left( {\mathscr {C} [u]}\right) \right\| _{{\text {H}} ^{-1}(\Omega )}. \end{aligned}$$
(6.5)

Now, we may use the fact that for a generic \(\varphi \in {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )\)

$$\begin{aligned} \left\langle {\text {Div}}\left( {\mathscr {C} [U]}\right) - {\text {Div}}\left( {\mathscr {C} [u]}\right) \,\vert \,\varphi \right\rangle&= -\left\langle {{\mathscr {C} [U] - \mathscr {C} [u]},\nabla \varphi }\right\rangle \nonumber \\&\le \left\| {\mathscr {C} [U] - \mathscr {C} [u]}\right\| _{{\text {L}} _{2}(\Omega )} \left\| \nabla \varphi \right\| _{{\text {L}} _{2}(\Omega )}. \end{aligned}$$
(6.6)

Since \(\varphi \) was generic, we may divide through by \(\left\| \nabla \varphi \right\| \) and take the supremum over \(\varphi \). Then, by the definition of the \({\text {H}} ^{-1}(\Omega )\) norm

$$\begin{aligned} \left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{{\text {H}} ^{-1}(\Omega )} \le \left\| \mathscr {C} [U] - \mathscr {C} [u]\right\| _{{\text {L}} _{2}(\Omega )}. \end{aligned}$$
(6.7)
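For reference, the dual norm used in this step can be written as follows, assuming the convention in which \({\text {H}} ^{-1}(\Omega )\) is paired with the gradient seminorm on \({\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )\), which is the convention consistent with the division by \(\left\| \nabla \varphi \right\| \) above:

$$\begin{aligned} \left\| g\right\| _{{\text {H}} ^{-1}(\Omega )} := \sup _{0 \ne \varphi \in {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )} \frac{\left\langle g \,\vert \,\varphi \right\rangle }{\left\| \nabla \varphi \right\| _{{\text {L}} _{2}(\Omega )}}. \end{aligned}$$

Other equivalent conventions, using the full \({\text {H}} ^{1}(\Omega )\) norm in the denominator, change the bound only by a constant factor.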

By the definition of the Noether quantity \(\mathscr {C} \) from (3.10), we have that

$$\begin{aligned}&\left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{{\text {H}} ^{-1}(\Omega )} \nonumber \\&\quad \le \left\| \phi (\varvec{x}, U) \partial _{3}{L} (\varvec{x},U,\nabla U) - \phi (\varvec{x}, u) \partial _{3}{L} (\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\qquad + \left\| {\left( {\varvec{\xi }(\varvec{x}, U)}\right) }^{{\varvec{\intercal }}} \nabla U \partial _{3}{L} (\varvec{x},U,\nabla U) - {\left( {\varvec{\xi }(\varvec{x}, u)}\right) }^{{\varvec{\intercal }}} \nabla u \partial _{3}{L} (\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\qquad +\left\| L(\varvec{x},U,\nabla U) \varvec{\xi }(\varvec{x}, U) - L(\varvec{x},u,\nabla u) \varvec{\xi }(\varvec{x}, u) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\quad =: \mathscr {I} _1 + \mathscr {I} _2 + \mathscr {I} _3 \end{aligned}$$
(6.8)

where for clarity we have written the dependencies explicitly. Now, for each of the \(\mathscr {I} _i\), we add and subtract appropriate quantities and make use of the triangle inequality. We thus have the following bounds:

$$\begin{aligned} \mathscr {I} _1&\le \left\| \phi (\varvec{x}, U) \left( { \partial _{3}{L} (\varvec{x},U,\nabla U) - \partial _{3}{L} (\varvec{x},u,\nabla u) }\right) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\quad \, + \left\| \left( { \phi (\varvec{x}, U) - \phi (\varvec{x}, u) }\right) \partial _{3}{L} (\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )} \end{aligned}$$
(6.9)
$$\begin{aligned} \mathscr {I} _2&\le \left\| \left( { \varvec{\xi }(\varvec{x}, u) - \varvec{\xi }(\varvec{x}, U) }\right) {\left( { \nabla u }\right) }^{{\varvec{\intercal }}} \partial _{3}{L} (\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\quad \, + \left\| { \left( {\varvec{\xi }(\varvec{x}, U)}\right) }^{{\varvec{\intercal }}} \left( { \nabla u - \nabla U }\right) \partial _{3}{L} (\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\quad \, + \left\| { \left( {\varvec{\xi }(\varvec{x}, U)}\right) }^{{\varvec{\intercal }}} \nabla U \left( { \partial _{3}{L} (\varvec{x},U,\nabla U) - \partial _{3}{L} (\varvec{x},u,\nabla u) }\right) \right\| _{{\text {L}} _{2}(\Omega )} \end{aligned}$$
(6.10)

and

$$\begin{aligned} \mathscr {I} _3&\le \left\| \left( { L(\varvec{x},U,\nabla U) - L(\varvec{x},u,\nabla u) }\right) \varvec{\xi }(\varvec{x}, U) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\qquad + \left\| L(\varvec{x},u,\nabla u) \left( { \varvec{\xi }(\varvec{x}, U) - \varvec{\xi }(\varvec{x}, u) }\right) \right\| _{{\text {L}} _{2}(\Omega )}. \end{aligned}$$
(6.11)

Since \(\phi \) and \(\varvec{\xi }\) are infinitesimals of a Lie group action, they are smooth, and under assumptions (6.2)–(6.3), we have that

$$\begin{aligned} \mathscr {I} _1&\le C \left( { \left\| \partial _{3}{L} (\varvec{x},U,\nabla U) - \partial _{3}{L} (\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )} + \left\| \phi (\varvec{x}, U) - \phi (\varvec{x}, u) \right\| _{{\text {L}} _{2}(\Omega )} }\right) \nonumber \\ \mathscr {I} _2&\le C\bigg (\left\| \varvec{\xi }(\varvec{x}, u) - \varvec{\xi }(\varvec{x}, U) \right\| _{{\text {L}} _{2}(\Omega )} + \left\| \nabla u - \nabla U \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\quad \, + \left\| \partial _{3}{L} (\varvec{x},U,\nabla U) - \partial _{3}{L} (\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )}\bigg )\nonumber \\ \mathscr {I} _3&\le C \left( { \left\| L(\varvec{x},U,\nabla U) - L(\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )} + \left\| \varvec{\xi }(\varvec{x}, u) - \varvec{\xi }(\varvec{x}, U) \right\| _{{\text {L}} _{2}(\Omega )} }\right) . \end{aligned}$$
(6.12)

Taking the sum of the \(\mathscr {I} _i\) gives the desired result. \(\square \)

Corollary 6.2

Let the conditions of Theorem 6.1 hold under the same variational symmetry group with infinitesimals \(\varvec{\xi }\) and \(\phi \). In addition, assume that the Lagrangian is sufficiently smooth such that both L and \(\partial _3L \) are (locally) Lipschitz with respect to the second and third variable. Then, the bound (6.4) can be simplified to

$$\begin{aligned} \left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{{\text {H}} ^{-1}(\Omega )}&\le C \Bigg ( \left\| U - u \right\| _{{\text {L}} _{2}(\Omega )} + \left\| \nabla U - \nabla u \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\qquad + \left\| \phi (\varvec{x}, U) - \phi (\varvec{x}, u)\right\| _{{\text {L}} _{2}(\Omega )} + \left\| \varvec{\xi }(\varvec{x}, U) - \varvec{\xi }(\varvec{x}, u)\right\| _{{\text {L}} _{2}(\Omega )} \Bigg ). \end{aligned}$$
(6.13)

Remark 6.3

The results of Theorem 6.1 are not just applicable to the finite element solution, but to any function. Indeed, the result is actually a property of the conservation law \(\mathscr {C} [\cdot ]\) rather than the approximation U.

Remark 6.4

(relating to the 2-Laplacian) We may relate Theorem 6.1 to the 2-Laplacian studied in Sect. 5.2. We were considering the case that L was invariant under rotations in the independent variable. In that case, we have that \(\phi \equiv 0\) and that

$$\begin{aligned} \left\| \varvec{\xi }(\varvec{x}, U) - \varvec{\xi }(\varvec{x}, u)\right\| = 0, \end{aligned}$$
(6.14)

since \(\varvec{\xi }\) is independent of u. Now, if we have that \(u\in {\text {W}} ^{1}_{\infty }(\Omega )\) and \(f\in {\text {L}} _{\infty }(\Omega )\), then \(U\in {\text {W}} ^{1}_{\infty }(\Omega )\) and

$$\begin{aligned}&\left\| L(\varvec{x},U,\nabla U) - L(\varvec{x},u,\nabla u) \right\| _{{\text {L}} _{2}(\Omega )} = \left\| \frac{1}{2}\left( {\left| \nabla U\right| ^2 - \left| \nabla u\right| ^2}\right) + fu - fU\right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\quad \le \frac{1}{2}\left\| \nabla U + \nabla u\right\| _{{\text {L}} _{\infty }(\Omega )}\left\| \nabla U - \nabla u\right\| _{{\text {L}} _{2}(\Omega )} + \left\| f\right\| _{{\text {L}} _{\infty }(\Omega )} \left\| u - U\right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\quad \le C\left( {\left\| \nabla U - \nabla u\right\| _{{\text {L}} _{2}(\Omega )} + \left\| u - U\right\| _{{\text {L}} _{2}(\Omega )}}\right) . \end{aligned}$$
(6.15)

Hence, we may infer that

$$\begin{aligned} \left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{{\text {H}} ^{-1}(\Omega )} \le C \left\| \nabla U - \nabla u \right\| _{{\text {L}} _{2}(\Omega )} \le C h \left| u\right| _{{\text {H}} ^{2}(\Omega )}, \end{aligned}$$
(6.16)

since for smooth solutions by Aubin–Nitsche duality arguments, we have \({\left\| u - U\right\| _{{\text {L}} _{2}(\Omega )} \le C\left\| \nabla u - \nabla U\right\| _{{\text {L}} _{2}(\Omega )}}\). In the case where u is more regular, say \(u\in {\text {H}} ^{k+1}(\Omega )\cap {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )\), where k is the degree of the finite element approximation, then

$$\begin{aligned} \left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{{\text {H}} ^{-1}(\Omega )} \le C h^k \left| u\right| _{{\text {H}} ^{k+1}(\Omega )}. \end{aligned}$$
(6.17)
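The rate predicted in (6.17) can be checked numerically through the experimental order of convergence between consecutive meshes. The following is a minimal sketch in Python; the function name `eoc` and the synthetic data are illustrative assumptions and are not taken from the experiments reported in this paper.

```python
import math


def eoc(mesh_sizes, errors):
    """Experimental order of convergence between consecutive meshes.

    mesh_sizes -- mesh sizes h_0 > h_1 > ... (hypothetical input)
    errors     -- matching error (or estimator) values, all positive
    Returns the list log(e_i / e_{i+1}) / log(h_i / h_{i+1}).
    """
    if len(mesh_sizes) != len(errors):
        raise ValueError("mesh_sizes and errors must have the same length")
    return [
        math.log(errors[i] / errors[i + 1])
        / math.log(mesh_sizes[i] / mesh_sizes[i + 1])
        for i in range(len(errors) - 1)
    ]


# Synthetic data behaving exactly like C h^k with k = 2 (illustration only).
mesh_sizes = [2.0 ** (-j) for j in range(1, 6)]
errors = [3.0 * h ** 2 for h in mesh_sizes]
rates = eoc(mesh_sizes, errors)
```

Applied to computed values of an error or estimator on a sequence of meshes, the entries of `rates` should approach the exponent of h in the corresponding bound.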

We turn our attention to a posteriori control of the quantity \({\mathscr {C} [U]}\).

6.1 An Adaptive Scheme for Convergence to the Smooth Law, Maintaining the Discrete Law

We derive an adaptive finite element scheme based on goal-oriented a posteriori analysis [28, c.f.]. At each stage, the discrete law holds. To simplify the presentation, we will limit our discussion to studying the 2-Laplacian (see Sect. 5.2) noting that what we present here is extendable to nonlinear problems at the expense of additional cumbersome notation. Here, we aim to illustrate the main idea.

Definition 6.5

(goal functional and dual problem) We introduce the goal functional

$$\begin{aligned} g(v) = \int _\Omega \left( {\mathscr {C} [u] - \mathscr {C} [U]}\right) \cdot \nabla v \,\mathrm {d}\varvec{x} \end{aligned}$$
(6.18)

and the formal adjoint problem, find \(z\in {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )\) such that

$$\begin{aligned} \int _\Omega \nabla w \cdot \nabla z \,\mathrm {d}\varvec{x} = g(w) \quad \,\forall \,w\in {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega ). \end{aligned}$$
(6.19)

Using this, we will construct a computable error bound for \(g(u) - g(U)\). The purpose is to construct a discrete solution, U, posed over an adaptively refined mesh, \(\mathscr {T} ^{}\), which (quasi) minimises the functional error. One would expect this to better respect the equality “\({\text {Div}}\left( {\mathscr {C} [U]}\right) = 0\)” than in the case of uniform refinement. Note that, in view of the Cauchy–Schwarz inequality,

$$\begin{aligned} g(u) - g(U)&= \int _\Omega \left( {\mathscr {C} [u] - \mathscr {C} [U]}\right) \cdot \nabla \left( {u - U}\right) \,\mathrm {d}\varvec{x}\nonumber \\&\le \left\| \mathscr {C} [u] - \mathscr {C} [U]\right\| _{{\text {L}} _{2}(\Omega )} \left\| \nabla \left( {u - U}\right) \right\| _{{\text {L}} _{2}(\Omega )}\nonumber \\&\le C h^{2k} \left| u\right| _{{\text {H}} ^{k+1}(\Omega )}^2, \end{aligned}$$
(6.20)

in view of Remark 6.4 so the resultant computable estimator should be of this order.

Lemma 6.6

(a goal-oriented error relation for the conservation law) Let \(u\in {\text {H}} ^{2}(\Omega )\cap {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )\) be a strong extremum of the variational problem defining Laplace’s problem given in Example 3.9. Suppose that Theorem 3.6 holds and that U is the finite element approximation to u, then for any \(z_h\in \mathbb V\)

$$\begin{aligned} g(u) - g(U) = \sum _{K\in \mathscr {T} ^{}} \left[ { \int _K \left( { f + \Delta U }\right) \left( { z - z_h }\right) \,\mathrm {d}\varvec{x} - \int _{\partial K} \nabla U \cdot \varvec{n}_K \left( {z - z_h}\right) \,\mathrm {d}s }\right] , \end{aligned}$$
(6.21)

where \(\varvec{n}_K\) denotes the outward pointing normal to K and z the solution of the dual problem given in Definition 6.5.

Proof

In view of the definition of the dual problem, we have

$$\begin{aligned} g(u) - g(U)&= \int _\Omega \nabla \left( {u - U}\right) \cdot \nabla z\,\mathrm {d}\varvec{x}\nonumber \\&= \int _\Omega \nabla \left( {u - U}\right) \cdot \nabla \left( {z - z_h}\right) \,\mathrm {d}\varvec{x} \quad \,\forall \,z_h \in \mathbb V\end{aligned}$$
(6.22)

using Galerkin orthogonality. Now, by integrating by parts elementwise, we see

$$\begin{aligned} g(u) - g(U)&= \sum _{K\in \mathscr {T} ^{}} \int _K -\Delta \left( {u - U}\right) \left( {z - z_h}\right) \,\mathrm {d}\varvec{x} + \int _{\partial K} \nabla \left( {u - U}\right) \cdot \varvec{n} \left( {z - z_h}\right) \,\mathrm {d}s, \end{aligned}$$
(6.23)

giving the desired result. \(\square \)
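In more detail, the step from (6.23) to (6.21) may be sketched as follows, under the assumption that \(\mathbb V\subset {\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )\), so that \(z - z_h\) vanishes on \(\partial \Omega \). Since u is a strong solution, \(-\Delta u = f\) in each element, and since \(u\in {\text {H}} ^{2}(\Omega )\), the normal flux \(\nabla u\cdot \varvec{n}_K\) has no jump across interior edges, so its contributions from the two elements sharing an edge cancel:

$$\begin{aligned} \sum _{K\in \mathscr {T} ^{}} \int _K -\Delta \left( {u - U}\right) \left( {z - z_h}\right) \,\mathrm {d}\varvec{x}&= \sum _{K\in \mathscr {T} ^{}} \int _K \left( {f + \Delta U}\right) \left( {z - z_h}\right) \,\mathrm {d}\varvec{x},\nonumber \\ \sum _{K\in \mathscr {T} ^{}} \int _{\partial K} \nabla u \cdot \varvec{n}_K \left( {z - z_h}\right) \,\mathrm {d}s&= 0. \end{aligned}$$

Substituting both identities into (6.23) leaves only the element residual \(f + \Delta U\) and the boundary term \(-\nabla U\cdot \varvec{n}_K\), which is (6.21).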

Remark 6.7

(computability of the estimator) The relation given in Lemma 6.6 takes the form of a residual, locally weighted by the dual problem. This is the basis of constructing dual weighted residual estimates. The relation is not a posteriori computable, however, as it depends upon the dual solution z which in turn depends upon the solution u. This presents issues in the practical realisation of the relation as an error estimator.

There are a variety of techniques developed to approximate the dual solution. Typically, such approximations are then substituted directly into the error relation given in Lemma 6.6. Taking \(z \approx \widehat{z}\) for some \(\widehat{z}\in \mathbb V\) is undesirable, since the residual vanishes over \(\mathbb V\) in view of Galerkin orthogonality, so a higher-order space than \(\mathbb V\) should be used. One successful technique is to approximate the dual solution on a finer mesh or by solving local problems; another is postprocessing using extrapolation [29] or difference quotients and heuristic error indicators [28].
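The vanishing of the residual over \(\mathbb V\) can be observed in a one-dimensional analogue. The sketch below is an illustration only, not the setting of this paper: it assumes the model problem \(-u'' = 1\) on (0, 1) with homogeneous Dirichlet conditions, piecewise linear elements on a uniform mesh, and hypothetical helper names. It evaluates the residual \(\int f w_h - \int U' w_h'\) for a discrete weight \(w_h\), which is zero up to rounding.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def p1_solution_and_residual(n_elements, weights):
    """Solve -u'' = 1 with P1 elements and test the residual against w_h.

    weights -- nodal values of w_h at the interior nodes (w_h lies in V).
    Returns (U, r) with U the interior nodal values of the discrete solution
    and r = sum_i w_i * (b_i - (K U)_i), the residual applied to w_h.
    """
    h = 1.0 / n_elements
    n = n_elements - 1  # number of interior nodes
    lower = [-1.0 / h] * n
    diag = [2.0 / h] * n
    upper = [-1.0 / h] * n
    load = [h] * n  # integral of f * phi_i for f = 1
    U = solve_tridiagonal(lower, diag, upper, load)
    r = 0.0
    for i in range(n):
        KU_i = diag[i] * U[i]
        if i > 0:
            KU_i += lower[i] * U[i - 1]
        if i < n - 1:
            KU_i += upper[i] * U[i + 1]
        r += weights[i] * (load[i] - KU_i)
    return U, r


U, r = p1_solution_and_residual(8, [1.0, -2.0, 0.5, 3.0, -1.0, 2.0, 0.25])
```

Whatever nodal values are chosen for the weight, `r` stays at rounding level, which is why a weight taken from \(\mathbb V\) itself carries no information about the error.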

Suppose we have computed Z, the Ritz projection of the solution of the dual problem (6.19), and that \(\widehat{z}\) is some reconstruction of Z based on some local postprocessing. We may then split the error relation from Lemma 6.6 into two parts:

$$\begin{aligned} g(u) - g(U) = \int _\Omega \nabla \left( {u - U}\right) \cdot \nabla \left( {z - \widehat{z}}\right) + \nabla \left( {u - U}\right) \cdot \nabla \left( {\widehat{z}- Z}\right) \,\mathrm {d}\varvec{x} =: \mathscr {A} _1 + \mathscr {A} _2 \end{aligned}$$
(6.24)

If \(\widehat{z}\) is a better approximation of z than Z is, then one would expect \(\mathscr {A} _1\) to be of higher order than \(\mathscr {A} _2\). Practically, this could lead to \(\mathscr {A} _1\) being neglected, but at coarse mesh scales, the approximation of the dual solution using these techniques without accounting for \(\mathscr {A} _1\) can be unreliable [30], although asymptotically, for fine enough mesh, the approximation would be justifiable [31].

Remark 6.8

(computational cost) In the next theorem, we will present a result based on approximating the dual problem on the same finite element space as the original problem and an a posteriori bound of the resultant approximation, allowing for any of the various postprocessing arguments already mentioned. Practically, at least in the self-adjoint case, the computation of this results in an extra inversion of a linear system which involves the same stiffness matrix as used in the computation of the original problem.

Theorem 6.9

(a computable upper bound for the goal error) Let \(Z\in \mathbb V\) be the finite element approximation of the dual problem given in Definition 6.5 and \(\widehat{z}\) be a postprocessor of Z. Suppose also that u is a strong extremum of the variational problem defining Laplace’s problem given in Example 3.9 and U its finite element approximation. Then, under the conditions of Lemma 6.6, there exists a constant C dependent on the shape regularity of \(\mathscr {T} ^{}\) such that

$$\begin{aligned} \left| g(u) - g(U)\right| \le C \left( {E_1 + E_2 + E_3 }\right) =: C E(U,f) \end{aligned}$$
(6.25)

where

$$\begin{aligned}&E_1 := \left| \sum _{K\in \mathscr {T} ^{}} \left( { \int _K \left( { f + \Delta U }\right) \left( { \widehat{z}- Z }\right) \,\mathrm {d}\varvec{x} - \int _{\partial K} \nabla U \cdot \varvec{n}_K \left( {\widehat{z}- Z}\right) \,\mathrm {d}s }\right) \right| \end{aligned}$$
(6.26)
$$\begin{aligned}&E_2 := \left\| f\right\| _{{\text {L}} _{2}(\Omega )} \left( { \sum _{K\in \mathscr {T} ^{}} h_K^2 \left\| {\text {Div}}\left( {\mathscr {C} [U]}\right) + \Delta \widehat{z}\right\| _{{\text {L}} _{2}(K)} + \sum _{e\in \mathscr {E}} h_e^{3/2} \left\| \llbracket \mathscr {C} [U] + \nabla \widehat{z} \rrbracket \right\| _{{\text {L}} _{2}(e)} }\right) \qquad \qquad \end{aligned}$$
(6.27)
$$\begin{aligned}&E_3 := \left\| f\right\| _{{\text {L}} _{2}(\Omega )} \left( { \sum _{K\in \mathscr {T} ^{}} h_K^2 \left\| \Delta \left( {Z - \mathscr {R} _k \widehat{z}}\right) \right\| _{{\text {L}} _{2}(K)} + \sum _{e\in \mathscr {E}} h_e^{3/2} \left\| \llbracket \nabla \left( {Z - \mathscr {R} _k \widehat{z}}\right) \rrbracket \right\| _{{\text {L}} _{2}(e)} }\right) \end{aligned}$$
(6.28)

and \(\mathscr {R} _k\) denotes the Ritz projection into \(\mathbb V\).

Remark 6.10

(structure of the estimate) The estimate has the following structure. The term \(E_1\) is the usual term computed in a dual weighted residual estimate: it consists of the residuals of the problem weighted by the majority of the dual error, where \(\widehat{z}\) represents any postprocessed reconstruction chosen. The second term, \(E_2\), represents the dual residual and the third term, \(E_3\), a reconstruction error. Together, the second and third terms quantify the additional error induced by \(\mathscr {A} _1\) from Remark 6.7. They are exactly the terms one would lose by computing only \(\mathscr {A} _2\).

Note that if we naively took \(\widehat{z} = Z\), the terms \(E_1\) and \(E_3\) would vanish. What would remain is \(E_2\), the dual residual. As will be shown in the numerical experiments, this term can be the dominant contributor to the estimator.

Proposition 6.11

(trace inequality) We will often use the following trace inequality, that for \(v\in \mathbb V\) there exists a constant \(C>0\) such that

$$\begin{aligned} \left\| v\right\| _{{\text {L}} _{2}(\mathscr {E})} \le C h_K^{-1/2} \left\| v\right\| _{{\text {L}} _{2}(\Omega )}. \end{aligned}$$
(6.29)

Proposition 6.12

(approximation properties of the Scott–Zhang interpolator [25, c.f.§1.130]) Let \(I_k\) denote the Scott–Zhang interpolant into \(\mathbb V\). Then, the following local approximation bounds over elements, \(K\in \mathscr {T} ^{}\) and faces \(e\in \mathscr {E} \) hold:

$$\begin{aligned} \left\| v- I_k v\right\| _{{\text {L}} _{2}(K)} \le C h_K^l \left| v\right| _{{\text {H}} ^{l}({\widehat{K})}} \end{aligned}$$
(6.30)
$$\begin{aligned} \left\| v- I_k v\right\| _{{\text {L}} _{2}(e)} \le C h_e^{l-1/2} \left| v\right| _{{\text {H}} ^{l}(\widehat{K})}, \end{aligned}$$
(6.31)

where \(\widehat{K}\) denotes a patch of an element K, the set of all elements sharing a common vertex with K.

Proof of Theorem 6.9

We begin by using the identity from Lemma 6.6 with \(z_h = \mathscr {R} _k z = Z\), where

$$\begin{aligned} \int _\Omega \nabla {w_h}\cdot \nabla {\mathscr {R} _k z} \,\mathrm {d}\varvec{x} = \int _\Omega \nabla {w_h} \cdot \nabla {z} \,\mathrm {d}\varvec{x} \quad \,\forall \,w_h\in \mathbb V, \end{aligned}$$
(6.32)

\(\mathscr {R} _k\) is the Ritz projection operator, and hence Z is the finite element approximation of z over \(\mathbb V\). Recall that the goal error satisfies

$$\begin{aligned} g(u) - g(U)&= \int _\Omega \nabla \left( {u - U}\right) \cdot \nabla \left( {z - \widehat{z}}\right) + \nabla \left( {u - U}\right) \cdot \nabla \left( {\widehat{z}- Z}\right) \,\mathrm {d}\varvec{x} =: \mathscr {A} _1 + \mathscr {A} _2. \end{aligned}$$
(6.33)

for any \(\widehat{z}\). We will henceforth assume that \(\widehat{z}\) is computable only from Z, using, for example, any of the methodologies mentioned in Remark 6.7. We immediately have that \(\mathscr {A} _2\) is computable. Turning our attention to the other term, we see using Galerkin orthogonality

$$\begin{aligned} \mathscr {A} _1&= \int _\Omega \nabla \left( {u - U}\right) \cdot \nabla \left( {z - \widehat{z}}\right) \,\mathrm {d}\varvec{x}\nonumber \\&= \int _\Omega \nabla \left( {u - U}\right) \cdot \left( {\nabla \left( {z - \widehat{z}}\right) - \nabla \mathscr {R} _k \left( {z - \widehat{z}}\right) }\right) \,\mathrm {d}\varvec{x}, \end{aligned}$$
(6.34)

where \(\mathscr {R} _k\) is the Ritz projection operator defined in (6.32). Hence,

$$\begin{aligned} \mathscr {A} _1&= \int _\Omega \nabla \left( {u - I_k u}\right) \cdot \left( {\nabla \left( {z - \widehat{z}}\right) - \nabla \mathscr {R} _k \left( {z - \widehat{z}}\right) }\right) \,\mathrm {d}\varvec{x}\nonumber \\&= \int _\Omega \nabla \left( {u - I_k u}\right) \cdot \left( {\nabla \left( {z - \widehat{z}}\right) - \nabla \left( {Z - \mathscr {R} _k\widehat{z}}\right) }\right) \,\mathrm {d}\varvec{x}\nonumber \\&= \int _\Omega \nabla \left( {u - I_k u}\right) \cdot \nabla \left( {z - \widehat{z}}\right) - \nabla \left( {u - I_k u}\right) \cdot \nabla \left( {Z - \mathscr {R} _k\widehat{z}}\right) \,\mathrm {d}\varvec{x}\nonumber \\&=: \mathscr {A} _{1,1} - \mathscr {A} _{1,2}. \end{aligned}$$
(6.35)

The idea now is to make use of the fact we have full elliptic regularity for u. Using the definition of the dual solution

$$\begin{aligned} \mathscr {A} _{1,1}&= \int _\Omega \nabla \left( {u - I_k u}\right) \cdot \nabla \left( {z - \widehat{z}}\right) \,\mathrm {d}\varvec{x}\nonumber \\&= g(u - I_k u) - \int _\Omega \nabla \left( {u - I_k u}\right) \cdot \nabla \widehat{z}\,\mathrm {d}\varvec{x}\nonumber \\&=\int _\Omega \left( {\mathscr {C} [u] - \mathscr {C} [U] - \nabla \widehat{z}}\right) \cdot \nabla \left( {u - I_k u}\right) \,\mathrm {d}\varvec{x}. \end{aligned}$$
(6.36)

Using that \(\mathscr {C} [U]\) is piecewise smooth, we split the integral elementwise and integrate by parts to see

$$\begin{aligned} \mathscr {A} _{1,1}&= \sum _{K\in \mathscr {T} ^{}} \int _K \left( {\mathscr {C} [u] - \mathscr {C} [U] - \nabla \widehat{z}}\right) \cdot \nabla \left( {u - I_k u}\right) \,\mathrm {d}\varvec{x}\nonumber \\&= \sum _{K\in \mathscr {T} ^{}} \int _K -\left( {{\text {Div}}\left( {\mathscr {C} [u]}\right) - {\text {Div}}\left( {\mathscr {C} [U]}\right) - \Delta \widehat{z}}\right) \left( {u - I_k u}\right) \,\mathrm {d}\varvec{x}\nonumber \\&\qquad - \int _{\partial K} \left( {\mathscr {C} [U] - \mathscr {C} [u] + \nabla \widehat{z}}\right) \cdot \varvec{n}_K \left( {u - I_k u}\right) \,\mathrm {d}s. \end{aligned}$$
(6.37)

Noting that \({\text {Div}}\left( {\mathscr {C} [u]}\right) = \llbracket \mathscr {C} [u] \rrbracket = 0\), by applying Cauchy–Schwarz inequality together with approximation properties of \(I_k u\) from Proposition 6.12 and elliptic regularity of u, we have

$$\begin{aligned} \mathscr {A} _{1,1}&\le \sum _{K\in \mathscr {T} ^{}} \left\| {\text {Div}}\left( {\mathscr {C} [U]}\right) + \Delta \widehat{z}\right\| _{{\text {L}} _{2}(K)} \left\| u - I_k u\right\| _{{\text {L}} _{2}(K)} \nonumber \\&\qquad + \sum _{e\in \mathscr {E}} \left\| \llbracket \mathscr {C} [U] + \nabla \widehat{z} \rrbracket \right\| _{{\text {L}} _{2}(e)} \left\| {u - I_k u}\right\| _{{\text {L}} _{2}(e)}\nonumber \\&\le C \left\| \Delta u\right\| _{{\text {L}} _{2}(\Omega )} \left( \sum _{K\in \mathscr {T} ^{}} h_K^2 \left\| {\text {Div}}\left( {\mathscr {C} [U]}\right) + \Delta \widehat{z}\right\| _{{\text {L}} _{2}(K)}\right. \nonumber \\&\left. \qquad + \sum _{e\in \mathscr {E}} h_e^{3/2} \left\| \llbracket \mathscr {C} [U] + \nabla \widehat{z} \rrbracket \right\| _{{\text {L}} _{2}(e)} \right) \nonumber \\&\le C \left\| f\right\| _{{\text {L}} _{2}(\Omega )} \left( \sum _{K\in \mathscr {T} ^{}} h_K^2 \left\| {\text {Div}}\left( {\mathscr {C} [U]}\right) + \Delta \widehat{z}\right\| _{{\text {L}} _{2}(K)}\right. \nonumber \\&\left. \qquad + \sum _{e\in \mathscr {E}} h_e^{3/2} \left\| \llbracket \mathscr {C} [U] + \nabla \widehat{z} \rrbracket \right\| _{{\text {L}} _{2}(e)} \right) . \end{aligned}$$
(6.38)

Note that higher regularity of u allows us to use higher-order approximability of \(I_k u\). Here, we restrict our attention to the case \(u\in {\text {H}} ^{2}(\Omega )\) as additional regularity does not yield further information in the estimate, just additional asymptotic convergence. Similarly, for the second term appearing in (6.35), we have

$$\begin{aligned} -\mathscr {A} _{1,2}&= \int _\Omega -{\nabla \left( {Z - \mathscr {R} _k\widehat{z}}\right) } \cdot \nabla \left( {u - I_k u}\right) \,\mathrm {d}\varvec{x}\nonumber \\&= \sum _{K\in \mathscr {T} ^{}} \int _K \Delta \left( {Z - \mathscr {R} _k\widehat{z}}\right) \left( {u - I_k u}\right) \,\mathrm {d}\varvec{x} - \int _{\partial K} \nabla \left( {Z - \mathscr {R} _k\widehat{z}}\right) \cdot \varvec{n}_K \left( {u - I_k u}\right) \,\mathrm {d}s\nonumber \\&\le \sum _{K\in \mathscr {T} ^{}} \left\| \Delta \left( {Z - \mathscr {R} _k\widehat{z}}\right) \right\| _{{\text {L}} _{2}(K)} \left\| u - I_k u\right\| _{{\text {L}} _{2}(K)} \nonumber \\&\quad \,\,+ \sum _{e\in \mathscr {E}} \left\| \llbracket \nabla \left( {Z - \mathscr {R} _k\widehat{z}}\right) \rrbracket \right\| _{{\text {L}} _{2}(e)} \left\| u - I_k u\right\| _{{\text {L}} _{2}(e)}\nonumber \\&\le C\left\| f\right\| _{{\text {L}} _{2}(\Omega )} \left( { \sum _{K\in \mathscr {T} ^{}} h_K^{2}\left\| \Delta \left( {Z - \mathscr {R} _k\widehat{z}}\right) \right\| _{{\text {L}} _{2}(K)} + \sum _{e\in \mathscr {E}} h_e^{3/2}\left\| \llbracket \nabla \left( {Z - \mathscr {R} _k\widehat{z}}\right) \rrbracket \right\| _{{\text {L}} _{2}(e)} }\right) . \end{aligned}$$
(6.39)

Combining (6.33), (6.38), and (6.39) yields the desired result. \(\square \)

Fig. 2

Examples of the construction of \(\widehat{z}\) as a piecewise quadratic function. In both cases, the red degree of freedom represents \(\varvec{x}_i\), and the green is the set \(\widehat{K_{\varvec{x}_i}}\). a Here, \(\varvec{x}_i\) is a vertex node. b Here, \(\varvec{x}_i\) is an edge node

Fig. 3

In this experiment, we consider the 2-Laplacian (Example 3.9). We fix f such that u is known. We solve the discrete problem on concurrently refined meshes and compute the \({\text {L}} _{2}(\Omega )\)-error, the \({{\overset{{\scriptscriptstyle \circ }}{{\text {H}}}}{}^{1}(\Omega )}\)-error, and the computable estimate given in Theorem 6.9. Notice that in the examples where the Lagrangian is invariant under rotations, the computable estimator \(E\left( {U,f}\right) \) converges like \({\text {O}}\left( {h^{3}}\right) \), which is faster than predicted in (6.20). a In this example, we choose f such that u is given by (6.41) and \(k=1\). In this case, the underlying Lagrangian is invariant under rotations. b As Fig. 3a with \(k=2\). c In this example, we choose f such that u is given by (6.42) and \(k=1\). In this case, the underlying Lagrangian is invariant under rotations. d In this example, we choose f such that \(u(x,y) = \sin {y}\) and \(k=1\). In this case, the underlying Lagrangian is invariant under translations

Remark 6.13

(standard a posteriori estimates) The error relation (6.16) implies that one may also make use of standard a posteriori estimates which control the gradient error. These take the form

$$\begin{aligned} \left\| \nabla u - \nabla U\right\| _{{\text {L}} _{2}(\Omega )} \le C\left( {\sum _{K\in \mathscr {T} ^{}} h_K^2 \left\| f + \Delta U\right\| _{{\text {L}} _{2}(K)}^2 + \sum _{e\in \mathscr {E}} h_e \left\| \llbracket \nabla U \rrbracket \right\| _{{\text {L}} _{2}(e)}^2 }\right) ^{1/2}. \end{aligned}$$
(6.40)

Although these are cheaper to compute, they result in worse error control of \(\left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{{\text {H}} ^{-1}(\Omega )}\) than the estimate presented in Theorem 6.9. The approximation of the dual problem gives additional information pertaining to the locality of the error and allows the construction of a better optimised mesh (Fig. 2).

Fig. 4

In this experiment, we fix f such that u is given by the “flower” function (6.41). We solve the FE problem on adaptively refined meshes where the estimate of Noether’s conservation law, \(E\left( {U,f}\right) \), is used as a refinement criterion. a The adaptive solution over the final mesh. b The initial mesh. c The mesh after five adaptive iterations. d The mesh after 12 adaptive iterations

Fig. 5

In this experiment, we fix f such that u is given by the “Mexican hat” function (6.42). We solve the FE problem on adaptively refined meshes where the estimate of Noether’s conservation law, \(E\left( {U,f}\right) \), is used as a refinement criterion. a The adaptive solution over the final mesh. b The initial mesh. c The mesh after five adaptive iterations. d The mesh after 12 adaptive iterations

6.1.1 Numerical Experiments

In Fig. 3, we show numerically the asymptotic behaviour of the estimate given in Theorem 6.9. All numerical experiments are conducted on the 2-Laplacian taking \(\Omega = B(0,1)\), which is discretised using an unstructured triangulation (as shown in Fig. 4b). We fix f such that u is given by either

(6.41)
(6.42)

The reconstruction \(\widehat{z}\) appearing in the estimates is obtained by using a higher-dimensional space. We take

$$\begin{aligned} \widehat{\mathbb V} = \left\{ \Phi \in {\text {C}} ^{0}(\Omega ) :\;\Phi \vert _{K} \in \mathbb {P} ^{k+1}\,\forall \,K\in \mathscr {T} ^{} \right\} , \end{aligned}$$
(6.43)

that is, a piecewise polynomial space one degree higher than \(\mathbb V\) and define \(\widehat{z}\in \widehat{\mathbb V}\) through

$$\begin{aligned} \widehat{z}(\varvec{x}_i) = \frac{1}{\text {card}(\widehat{K_{\varvec{x}_i}})}\sum _{\varvec{x}_j \in \widehat{K_{\varvec{x}_i}}} Z(\varvec{x}_j) \qquad i = 1, \ldots , \dim \left( {\widehat{\mathbb V}}\right) . \end{aligned}$$
(6.44)

Given a fixed degree of freedom \(\varvec{x}_i\) from \(\widehat{\mathbb V}\), the set \(\widehat{K_{\varvec{x}_i}}\) is the collection of degrees of freedom from \(\mathbb V\) that lie on an element containing \(\varvec{x}_i\) (see Fig. 2). The reconstruction is nothing but a local averaging of the finite element solution Z.
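The averaging in (6.44) can be illustrated in a one-dimensional analogue. The sketch below is an assumption-laden illustration rather than the implementation used in the experiments: it takes a 1D mesh with piecewise linear Z, uses vertices and element midpoints as the degrees of freedom of the quadratic space, and does not treat boundary nodes specially. The function name is hypothetical.

```python
def average_reconstruction(Z):
    """Values of z_hat at the quadratic degrees of freedom of a 1D mesh.

    Z -- nodal values Z(x_0), ..., Z(x_N) of the piecewise linear function
         at the mesh vertices.
    Returns (vertex_values, midpoint_values). Following (6.44), each value is
    the mean of Z over the linear degrees of freedom lying on the elements
    that contain the quadratic degree of freedom.
    """
    n_vertices = len(Z)
    if n_vertices < 2:
        raise ValueError("need at least one element")
    vertex_values = []
    for j in range(n_vertices):
        # Vertex j lies on elements [j-1, j] and [j, j+1], when they exist.
        patch = [i for i in (j - 1, j, j + 1) if 0 <= i < n_vertices]
        vertex_values.append(sum(Z[i] for i in patch) / len(patch))
    # A midpoint lies on the single element [j, j+1].
    midpoint_values = [(Z[j] + Z[j + 1]) / 2 for j in range(n_vertices - 1)]
    return vertex_values, midpoint_values


vertex_values, midpoint_values = average_reconstruction([0.0, 1.0, 4.0, 9.0])
```

In two dimensions the same loop runs over the elements sharing the node, which is the situation depicted for vertex and edge nodes in the figure illustrating the construction of \(\widehat{z}\).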

6.1.2 Results

We conclude our numerical experiments by using the computable estimator \(E\left( {U,f}\right) \) defined in Theorem 6.9 to construct an adaptive scheme aimed at minimising \(E\left( {U,f}\right) \) (and hence \(\left\| {\text {Div}}{\mathscr {C} [U]}\right\| _{-1}\)). The adaptive algorithm we make use of is of standard type (SOLVE \(\rightarrow \) ESTIMATE \(\rightarrow \) MARK \(\rightarrow \) REFINE [32, c.f.]) utilising the maximum strategy marking and newest vertex bisection refinement. We conduct two experiments on different test problems. In both cases, we take \(\Omega \) as a polygonal approximation of the unit disc, the degree of the finite element method \(k=1\), and we fix the function \(f= f(\left| \varvec{x}\right| )\) in the Lagrangian, so that the classical solution u is known.
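The structure of such a loop can be sketched as follows. This is a generic illustration under stated assumptions: the marking parameter `theta`, the stopping test on the sum of the local indicators, and the callables `solve`, `estimate` and `refine` are hypothetical placeholders and are not specified in this paper.

```python
def mark_maximum_strategy(indicators, theta=0.5):
    """Indices of elements whose indicator is at least theta times the largest."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    if not indicators:
        return []
    threshold = theta * max(indicators)
    return [k for k, eta in enumerate(indicators) if eta >= threshold]


def adapt(mesh, solve, estimate, refine, tol, max_iter=20, theta=0.5):
    """SOLVE -> ESTIMATE -> MARK -> REFINE until the estimate is below tol.

    solve(mesh)          -- returns the discrete solution on mesh
    estimate(mesh, U)    -- returns one nonnegative indicator per element
    refine(mesh, marked) -- returns a new mesh with marked elements refined
    Returns (mesh, U, converged).
    """
    for _ in range(max_iter):
        U = solve(mesh)
        indicators = estimate(mesh, U)
        if sum(indicators) <= tol:
            return mesh, U, True
        marked = mark_maximum_strategy(indicators, theta)
        mesh = refine(mesh, marked)
    # Iteration budget exhausted: solve once more on the final mesh.
    return mesh, solve(mesh), False
```

Since the discrete conservation law holds on every mesh, each iterate returned by such a loop satisfies it exactly; only the distance to the continuous law is driven below the tolerance.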

In Fig. 4, we choose u given by (6.41). We ran both the uniform and adaptive strategies described above. Using an adaptively refined mesh with \(\dim {\mathbb V} = 4988\), we found the estimator \(E\left( {U,f}\right) \approx 0.15\) and the gradient error \(\left\| \nabla u - \nabla U\right\| \approx 1.57\). Under uniform refinement with \(\dim {\mathbb V} = 8107\), we found the estimator \(E\left( {U,f}\right) \approx 0.15\) and the gradient error \(\left\| \nabla u - \nabla U\right\| \approx 1.64\).

In Fig. 5, we choose u given by (6.42). We again ran both the uniform and adaptive strategies. Using an adaptively refined mesh with \(\dim {\mathbb V} = 7885\), we found the estimator \(E\left( {U,f}\right) \approx 0.00019\) and the gradient error \(\left\| \nabla u - \nabla U\right\| \approx 0.056\). Under uniform refinement with \(\dim {\mathbb V} = 32269\), we found the estimator \(E\left( {U,f}\right) \approx 0.00019\) and the gradient error \(\left\| \nabla u - \nabla U\right\| \approx 0.064\).

In both cases, the adaptive approximation saves a significant number of degrees of freedom. The adaptive solution respects the conserved quantity better than the uniform refinement case and further yields a better approximation of the solution itself.

7 Conclusions and Outlook

In this work, we have generalised Noether’s first Theorem by proving a Noether-type Theorem for a specific class of weak extrema to a model variational problem, that is, those which are Lipschitz continuous. We have in addition proved an equivalent discrete theorem for the finite element approximation of the problem. We write the exact conserved quantity for the discrete scheme in the same spirit as [10]. We have demonstrated that the Lagrangian finite elements enjoy the property of asymptotically conserving the strong Noether conservation laws when approximating strong solutions of certain classes of variational problem and Lie group action.

In addition, we have studied the exact discrete conserved quantities numerically. These are conserved irrespective of whether the underlying symmetry is built into the underlying mesh. We have also constructed a geometric-based adaptive scheme to conserve the approximate continuous quantities up to a user-specified tolerance. This means that upon each adaptive step, there is a discrete conserved quantity which can be taken as close to the continuous counterpart as the user specifies, at the expense of solving an additional problem.