1 Introduction

We consider the optimal control problem

$$\begin{aligned} u_0^{min} ={arg\,min}_{u_0 \in {\mathcal {U}}_{ad}} J(y(T,\cdot ;u_0),y_d) \end{aligned}$$
(1)

with the tracking-type functional

$$\begin{aligned} J(y(T,\cdot ;u_0),y_d) =&\;\int _{I} G(y(T,x;u_0),y_d(x))\, dx, \end{aligned}$$
(2)

where

$$\begin{aligned} G(y(T,x;u_0),y_d(x)) =&\; \frac{1}{2} |y(T,x;u_0)-y_d(x)|^2 \end{aligned}$$
(3)

and \(y=y(T,x;u_0)\) is the scalar entropy solution at the final time \(T>0\) of the nonlinear hyperbolic conservation law (later referred to as state equation)

$$\begin{aligned} \begin{aligned} \partial _ty + \partial _xf(y) =&\; 0, \quad \quad (t,x) \in \Omega _T :=(0,T] \times {\mathbb {R}}, \\ y(0,x) =&\; u_0(x), \ x \in {\mathbb {R}}. \end{aligned} \end{aligned}$$
(4)

Here, \(u_0\in {\mathcal {U}}_{ad}\subseteq L^{\infty }({\mathbb {R}})\) is the control and \(y_d \in L^2({\mathbb {R}})\) denotes a given target towards which we strive to optimize. We assume that the flux function satisfies \(f \in C^m({\mathbb {R}})\) with sufficiently large \(m\in {\mathbb {N}}\) and is convex, the admissible set \({\mathcal {U}}_{ad}\) is non-empty, convex and closed, and the region of integration I in (2) is a bounded interval. Weak solutions to (4) are in general not unique, so the physically relevant solution has to be selected. We recall the well-known result from [21], which states that for \(u_0 \in L^{\infty }({\mathbb {R}}) \cap BV({\mathbb {R}})\) there exists a unique entropy solution in the sense of Kružkov in the class \(C([0,T],L_{loc}^1({\mathbb {R}})) \cap L^{\infty }({\mathbb {R}}\times [0,T])\). Using well-posedness and compactness properties of this solution, the existence of a minimizer \(u_0^{min}\) in (1) can be established under some natural additional assumptions on the class of admissible data \({\mathcal {U}}_{ad}\), see e.g. [5, Theorem 2.1] and [16, Proposition A.1]. In general, uniqueness of the minimizer is not guaranteed, since discontinuous solutions of (4) can coincide for different initial values. An illustrative example can be found in [5]. These statements generalize to the case where a regularization term \({\mathcal {R}}(u)\) is added to the objective function [28].

In this work, we focus on the numerical treatment of optimal control problems (1) governed by hyperbolic conservation laws, which has been studied amongst others in [1, 2, 5, 6, 10,11,12, 16,17,18, 22,23,24, 29,30,31]. We follow the first-discretize-then-optimize approach, i.e., Eq. (4) is first discretized in space and time by applying a weighted essentially non-oscillatory (WENO) scheme and a strong stability preserving Runge-Kutta (SSPRK) method. This leads to a finite-dimensional optimal control problem, for which the first-order discrete optimality system can be derived and solved by existing optimization solvers such as nonlinear Newton-type algorithms. In spite of the large size of the resulting problems, the flexibility of this approach naturally allows the incorporation of additional constraints and bounds. Further advantages are the direct use of automatic differentiation techniques and the computation of discrete adjoints, which are consistent with the discrete optimal control problem. Symmetric approximations of Hessian matrices can be easily derived and result in a computational speedup.

The application of common methods from nonlinear optimization requires the computation of directional derivatives of the target functional J with respect to the control. The gradient can be computed efficiently by the so-called adjoint approach, in which the derivative is represented via the adjoint state. The crucial issue of hyperbolic conservation laws is the possible formation of shocks even for smooth initial data, for which reason the classical adjoint calculus does not apply. To overcome these difficulties, nonstandard variational concepts have been developed in [4, 29, 31], which incorporate the shock sensitivity in order to derive rigorous optimality conditions. The resulting non-conservative equation has been studied in [3, 7, 29]. Its numerical resolution is intricate, since the interior boundary condition, which is defined on a set of Lebesgue measure zero in the continuous setting, is not present in the discrete counterpart. This inherent problem has been addressed in [1, 10,11,12]. The theory is, however, restricted to differentiable monotone schemes which have sufficiently large numerical diffusion and are of first order only.

To avoid unwanted smearing of the solution by large numerical diffusion and to overcome the low-order restriction of monotone schemes, we propose a novel approach based on WENO schemes introduced in [20, 25, 26]. These schemes have been shown to approximate hyperbolic equations comprising both shocks and complex smooth solution structures with high accuracy and adequate stability, along with good resolution around discontinuities. Although these methods are formally first-order accurate when a shock is present, they still have uniform high-order accuracy right up to the shock location. WENO schemes are extensions of the ENO procedure, i.e., they are essentially non-oscillatory, but overcome shortcomings of the ENO approximation, see [27] for a detailed discussion. By employing a global flux splitting, the numerical flux function becomes classically differentiable, which allows the development of discrete adjoint WENO methods of higher order. Since the third-order WENO method is widely used in applications, we consider this method in the context of optimal control in more detail. We prove that the discrete adjoint WENO3 method is third-order consistent in space for smooth solutions. A fully discrete method is derived by applying a third-order SSPRK method. We present numerical results and study the approximation behaviour of the adjoint WENO3 scheme. Finally, we solve an optimal control problem with discontinuous target and compare the performance of our novel scheme to common first-order schemes such as the modified Lax–Friedrichs and the Engquist–Osher scheme. Further examples can be found in [9].

2 Adjoint equation and reversible solutions

In this section, we briefly recall some theoretical basics in order to set up appropriate adjoint equations for hyperbolic conservation laws. As pointed out in [4, Example 1], the solution operator \(S_t: u_0 \mapsto y(t,\cdot ;u_0)\) is generically not differentiable in \(L_{loc}^1({\mathbb {R}})\), for which reason the classical adjoint calculus does not apply. However, in [29] it has been shown that entropy solutions to hyperbolic conservation laws admit a generalized differentiable structure called shift-differentiability. Under suitable assumptions, a generalized Taylor expansion in \(L_{loc}^1\) of the form

$$\begin{aligned} y(t, \cdot ; u_0 +\delta u_0) =y(t,\cdot ; u_0)+ S_y^{(x_i)}(T_{u_0}(\delta u_0))(\cdot ) +o(\Vert \delta u_0 \Vert _{L^{\infty }({\mathbb {R}})} ) \end{aligned}$$
(5)

exists for all \(\delta u_0 \in L^{\infty }({\mathbb {R}})\), where \(T_u: \delta u \in L^\infty ({\mathbb {R}}) \mapsto (\delta y^T, \delta x_1,\ldots ,\delta x_N) \in L^r(I)\times {\mathbb {R}}^N\), \(r\in (1,\infty ]\), is a bounded linear operator and \(S_y^{(x_i)}\) is the shift variation defined by

$$\begin{aligned} S_y^{(x_i)}(\delta y^T, \delta x_1,\ldots ,\delta x_N)(x) = \delta y^T(x)+ \sum _{i=1}^N (y(T,x_i-)-y(T,x_i+))\,\text {sign}(\delta x_i) {\varvec{1}}_{\Omega _i}, \end{aligned}$$
(6)

where \(\Omega _i=[\min (x_i,x_i+\delta x_i),\max (x_i,x_i+\delta x_i)]\) and \(x_1,\ldots ,x_N\) denote the locations of the down-jumps of the entropy solution. The important advantage of shift-variations is that this framework allows an adjoint calculus for hyperbolic conservation laws to be developed by using an averaged sensitivity equation, which avoids the linearization of (4) in the usual way, see [31] for further details. The directional derivative of J in (2) in the direction of \(\delta u_0\) can then be represented by

$$\begin{aligned} \partial _{u_0}J(y(T,\cdot ;u_0),y_d) \, \delta u_0(\cdot ) = \int _I p(0,x)\, \delta u_0(x)\, dx, \end{aligned}$$
(7)

where p is the solution of the adjoint equation

$$\begin{aligned} \begin{aligned} \partial _tp + f'(y)\partial _xp =&\; 0, \quad \quad (t,x) \in \Omega _T, \\ p(T,x) =&\; p^T(x), \ x \in {\mathbb {R}}. \end{aligned} \end{aligned}$$
(8)

Here, \(p^{T}(x)\) is given by

$$\begin{aligned} \begin{aligned} p^T(x) = \left\{ \begin{array}{ll} \displaystyle \frac{[G(y(T,x),y_d(x))]}{[y(T,x)]}, &{} x \in X_s , \\ \partial _yG(y(T,x),y_d(x)), &{} \text {otherwise}, \end{array}\right. \end{aligned} \end{aligned}$$
(9)

where \(X_s\) is the set of locations where \(y(T,\cdot )\) possesses a shock and \([w(x)] :=w(x-)-w(x+)\), which naturally incorporates the shock sensitivity.

Equation (8) is a linear transport equation with, in general, discontinuous coefficients. It admits multiple solutions, which requires the selection of the correct adjoint state. This is achieved by so-called reversible solutions that are defined along generalized characteristics [8]. An illustrative demonstration is given in Example 2.1. Under suitable technical assumptions and for appropriate end data \(p^T\) it can be shown that there exists a unique reversible solution to (8) that is bounded, \(L^{\infty }\)-stable, and TV-stable [29, Theorem 4.2.10 and Corollary 4.2.11]. In what follows, we will work with formulation (8) to derive a discrete adjoint WENO3 method.

Example 2.1

Let \(f(y)= \frac{1}{2} y^2\), \(u_0(x)=-\text {sign}(x)\), \(T=0.5\), and \(y_d(x)=0\) with \(x \in {\mathbb {R}}\). It is well-known that the unique entropy solution is given by \(y(t,x)=-\text {sign}(x)\), \(t \in [0,T]\), and hence we have

$$\begin{aligned} p^{T}(x)= \left\{ \begin{array}{ll} 0, &{} x = 0, \\ -\text {sign}(x), &{} x \ne 0. \end{array}\right. \end{aligned}$$
(10)
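As a short worked step, the two branches of (10) can be read off from (9) with \(G(y,y_d)=\frac{1}{2}|y-y_d|^2\), \(y_d=0\), and the one-sided limits \(y(T,0-)=1\), \(y(T,0+)=-1\) of the stationary shock:

$$\begin{aligned} \begin{aligned} p^T(0) =&\; \frac{[G(y(T,0),0)]}{[y(T,0)]} = \frac{\frac{1}{2}\cdot 1^2-\frac{1}{2}\cdot (-1)^2}{1-(-1)} = 0, \\ p^T(x) =&\; \partial _yG(y(T,x),0) = y(T,x) = -\text {sign}(x), \quad x \ne 0. \end{aligned} \end{aligned}$$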

The area that is not occupied by the classical characteristics is called shock funnel. It is represented by the grey-coloured triangle in Fig. 1. In this region, p takes the constant value zero. The adjoint remains constant along the classical backwards characteristics outside this region. Hence, the reversible solution p is given by

$$\begin{aligned} p(0,x) = {\left\{ \begin{array}{ll} \quad 1 &{} , x<-\frac{1}{2}, \\ \quad 0 &{} , -\frac{1}{2} \le x \le \frac{1}{2}, \\ \,- 1 &{} , \frac{1}{2}<x. \end{array}\right. } \end{aligned}$$
(11)

Fig. 1 Construction of the reversible solution \(p(t,x)\) from the end data \(p^T(x)\) at \(T=0.5\). The shock funnel region is highlighted as a grey-coloured triangle

3 Discrete adjoint WENO3 method

In order to discretize (4) in space, we now consider solutions with compact support \([a,b]\) for the entire time interval [0, T] and take \(y(a)=y(b)=0\) as boundary conditions. Thus, using as many stencil points outside as needed and the compactness of the solution, the implementation of the zero boundary condition does not pose any difficulty. The interval \([a,b]\) is partitioned into subintervals \([x_{j-1/2},x_{j+1/2}]\) of the same size \(\Delta x\) and with midpoints \(x_{j}\) for \(j=1,\ldots ,N\). Setting \({\varvec{u}}_{0}:=(u_0(x_1),\ldots ,u_0(x_N))^{\mathsf{T}}\) and defining spatial approximations \({\varvec{y}}(t):=(y_1(t),\ldots ,y_N(t))^{\mathsf{T}}\) with \(y_j(t) \approx y(t,x_j)\), a spatial semi-discretization of (4) reads

$$\begin{aligned} {\varvec{y}}'(t) = -F_{\Delta x} ({\varvec{y}}(t)), \quad {\varvec{y}}(0) = {\varvec{u}}_{0} \in {\mathbb {R}}^N, \end{aligned}$$
(12)

where the nonlinear operator \(F_{\Delta x}: {\mathbb {R}}^N \rightarrow {\mathbb {R}}^N\) represents the discretization of \(\partial _xf(y)\). We choose a conservative finite difference

$$\begin{aligned} (F_{\Delta x}({\varvec{y}}(t)))_j = \frac{1}{\Delta x} \Bigl ({\hat{f}}_{j+\frac{1}{2}}- {\hat{f}}_{j-\frac{1}{2}} \Bigr ), \end{aligned}$$
(13)

where \({\hat{f}}_{j+1/2}: {\mathbb {R}}^m \rightarrow {\mathbb {R}}\) denotes the numerical flux at \(x_{j+1/2}\), which is (at least) a Lipschitz continuous function of m neighboring values \(y_i(t)\). In order to avoid the convergence of the scheme towards entropy violating solutions, we apply a global flux splitting

$$\begin{aligned} {\hat{f}}_{j+\frac{1}{2}} = {\hat{f}}^{+}_{j+\frac{1}{2}} + {\hat{f}}^{-}_{j+\frac{1}{2}}. \end{aligned}$$
(14)

Using the simple Lax–Friedrichs splitting \(f^{\pm }(y)=(f(y) \pm \alpha y)/2\) with \(\alpha :=\max _{u} |f'(u)|\) yields the desired properties \((f^+)'(y)\ge 0\) and \((f^-)'(y)\le 0\). Then, the numerical flux functions of the WENO3 method [26] are defined by

$$\begin{aligned} {\hat{f}}^+_{j+\frac{1}{2}}(y_{j-1},y_j,y_{j+1}) :=&\; \omega _1^+ \Bigl (-\frac{1}{2} f^+(y_{j-1}) + \frac{3}{2} f^+(y_j) \Bigr ) + \omega _2^+ \Bigl (\frac{1}{2} f^+(y_j) + \frac{1}{2} f^+(y_{j+1}) \Bigr ), \end{aligned}$$
(15)
$$\begin{aligned} {\hat{f}}^-_{j-\frac{1}{2}}(y_{j-1},y_j,y_{j+1}) :=&\; \omega _2^- \Bigl (-\frac{1}{2} f^-(y_{j+1}) + \frac{3}{2} f^-(y_j) \Bigr ) + \omega _1^- \Bigl (\frac{1}{2} f^-(y_{j}) + \frac{1}{2} f^-(y_{j-1}) \Bigr ), \end{aligned}$$
(16)

where the weights are

$$\begin{aligned} \omega _m^\pm =\frac{{\tilde{\omega }}_m^\pm }{\sum _{i=1,2} {\tilde{\omega }}_i^\pm }, \quad {\tilde{\omega }}_m^\pm = \frac{\gamma _m^\pm }{(\varepsilon + \beta _m^\pm )^2}, \quad m=1,2. \end{aligned}$$
(17)

The smoothness indicators are given by

$$\begin{aligned} \beta _1^\pm =(f^\pm (y_j)-f^\pm (y_{j-1}))^2, \ \beta _2^\pm = (f^\pm (y_{j+1})-f^\pm (y_j))^2 \end{aligned}$$
(18)

and the linear weights are set to \(\gamma _1^+\!=\!\gamma _2^-\!=\!1/3,\;\gamma _1^-\!=\!\gamma _2^+\!=\!2/3\). Note that \(0<\varepsilon \ll 1\) is chosen in order to avoid the denominator becoming zero. It is set to \(\varepsilon \!=\!10^{-6}\) in our numerical calculations. We would like to emphasize that, by construction, the numerical fluxes \({\hat{f}}^\pm\) have the same smoothness with respect to their arguments as the physical flux function f(y).
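The semi-discretization (12)–(18) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `weno3_rhs`, the use of two zero ghost cells per side to realize the zero boundary condition, and passing \(\alpha\) as an argument are choices made for this sketch.

```python
import numpy as np

def weno3_rhs(y, dx, f, alpha, eps=1e-6):
    """Evaluate F_dx(y) from (13)-(18); zero ghost values are an assumption."""
    yp = np.concatenate([np.zeros(2), y, np.zeros(2)])  # padded grid, y_j at index j+1
    fp = 0.5 * (f(yp) + alpha * yp)                     # f^+ (Lax-Friedrichs splitting)
    fm = 0.5 * (f(yp) - alpha * yp)                     # f^-

    def weights(h, g1, g2):
        # nonlinear weights (17) with smoothness indicators (18), one pair per stencil centre
        b1 = (h[1:-1] - h[:-2]) ** 2
        b2 = (h[2:] - h[1:-1]) ** 2
        t1, t2 = g1 / (eps + b1) ** 2, g2 / (eps + b2) ** 2
        return t1 / (t1 + t2), t2 / (t1 + t2)

    w1p, w2p = weights(fp, 1 / 3, 2 / 3)                # gamma_1^+ = 1/3, gamma_2^+ = 2/3
    w1m, w2m = weights(fm, 2 / 3, 1 / 3)                # gamma_1^- = 2/3, gamma_2^- = 1/3
    l, c, r = slice(0, -2), slice(1, -1), slice(2, None)
    # entry k belongs to stencil centre m = k, with m = 0, ..., N+1
    fhat_p = w1p * (-0.5 * fp[l] + 1.5 * fp[c]) + w2p * (0.5 * fp[c] + 0.5 * fp[r])  # (15), at m+1/2
    fhat_m = w2m * (-0.5 * fm[r] + 1.5 * fm[c]) + w1m * (0.5 * fm[c] + 0.5 * fm[l])  # (16), at m-1/2
    flux = fhat_p[:-1] + fhat_m[1:]                     # (14) at interfaces j+1/2, j = 0..N
    return (flux[1:] - flux[:-1]) / dx                  # (13)
```

For Burgers' flux one would call, for instance, `weno3_rhs(y, dx, lambda v: 0.5 * v**2, alpha=np.max(np.abs(y)))`. Because the scheme is in conservation form, the entries of the result sum to the difference of the two boundary fluxes, which vanishes for data supported away from the boundary.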

Next we will derive the associated adjoint WENO3 scheme. Let \(f\in C^2({\mathbb {R}})\), i.e., there exists the Fréchet derivative of \(F_{\Delta x}\) defined in (13). The continuous optimal control problem is approximated by

$$\begin{aligned} {\varvec{u}}_0^{min} = {arg\,min} _{{\varvec{u}}_0 \in U_{ad}} \sum _{j=1}^{N} G(y_j(T),y_d(x_j)), \end{aligned}$$
(19)

where \(U_{ad}=\{{\varvec{u}}\in {\mathbb {R}}^N:\,\text {TV}({\varvec{u}})\le C\}\) is the discrete admissible set. Then, applying the common Lagrangian approach in \({\mathbb {R}}^N\) with multipliers \({\varvec{p}}(t)=(p_1(t),\ldots ,p_N(t))^{\mathsf{T}}\), the adjoint equation to (12) reads

$$\begin{aligned} {\varvec{p}}'(t) = \nabla _{{\varvec{y}}}F_{\Delta x} ({\varvec{y}}(t))^{\mathsf{T}}\,{\varvec{p}}(t), \quad {\varvec{p}}(T) = \left( \partial _yG(y_j(T),y_d(x_j))\right) _{j=1,\ldots ,N}, \end{aligned}$$
(20)

where \(\nabla _{{\varvec{y}}}F_{\Delta x}\) is the Fréchet derivative of \(F_{\Delta x}\) and gradients are treated as row vectors. The initial condition (the adjoint equation works backwards in time) is the discrete counterpart to (9). Observe that the interior boundary condition does not appear here. A short calculation yields the componentwise description

$$\begin{aligned} p_j'(t) = \frac{1}{\Delta x}\sum _{i=-2}^{2} \partial _{y_j}L_{i,j}({\varvec{y}}(t))\,p_{j+i}(t), \quad j=1,\ldots ,N, \end{aligned}$$
(21)

with the coefficients

$$\begin{aligned} \begin{aligned} L_{-2,j}({\varvec{y}}) =&\; {\hat{f}}^{-}_{j-3/2}, \\ L_{-1,j}({\varvec{y}}) =&\; {\hat{f}}^{+}_{j-1/2} + {\hat{f}}^{-}_{j-1/2} - {\hat{f}}^{-}_{j-3/2}, \\ L_{0,j}({\varvec{y}}) =&\; {\hat{f}}^{+}_{j+1/2} + {\hat{f}}^{-}_{j+1/2} - {\hat{f}}^{+}_{j-1/2} - {\hat{f}}^{-}_{j-1/2}, \\ L_{1,j}({\varvec{y}}) =&\; {\hat{f}}^{+}_{j+3/2} - {\hat{f}}^{+}_{j+1/2} - {\hat{f}}^{-}_{j+1/2}, \\ L_{2,j}({\varvec{y}}) =&\; -{\hat{f}}^{+}_{j+3/2} . \end{aligned} \end{aligned}$$
(22)

The indices of the numerical flux functions are directly related to their arguments, e.g. \({\hat{f}}^{+}_{j+3/2}(y_j,y_{j+1},y_{j+2})\) due to (15). For later use, we note that \(\sum _{i=-2,\ldots ,2}L_{i,j}({\varvec{y}})=0\).
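The short calculation behind (21) and the zero-sum property can be spelled out as follows; this is a worked step that uses only (13), (14), (20) and (22). Only the components \((F_{\Delta x})_{j+i}\), \(i=-2,\ldots ,2\), contain split fluxes that depend on \(y_j\), and \(L_{i,j}\) collects exactly these fluxes from \(\Delta x\,(F_{\Delta x})_{j+i}\), so that

$$\begin{aligned} p_j'(t) = \sum _{k=1}^{N} \partial _{y_j}(F_{\Delta x}({\varvec{y}}))_k\,p_k(t) = \frac{1}{\Delta x}\sum _{i=-2}^{2} \partial _{y_j}L_{i,j}({\varvec{y}})\,p_{j+i}(t). \end{aligned}$$

Adding the five coefficients in (22), every split flux appears once with each sign,

$$\begin{aligned} \sum _{i=-2}^{2}L_{i,j} = \bigl ({\hat{f}}^{-}_{j-3/2}-{\hat{f}}^{-}_{j-3/2}\bigr ) + \bigl ({\hat{f}}^{\pm }_{j-1/2}-{\hat{f}}^{\pm }_{j-1/2}\bigr ) + \bigl ({\hat{f}}^{\pm }_{j+1/2}-{\hat{f}}^{\pm }_{j+1/2}\bigr ) + \bigl ({\hat{f}}^{+}_{j+3/2}-{\hat{f}}^{+}_{j+3/2}\bigr ) = 0, \end{aligned}$$

where \({\hat{f}}^{\pm }\) abbreviates the sum of the plus and minus parts at the same interface.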

We will now study the consistency order of the adjoint WENO3 scheme, i.e., how accurately the semi-discretization (20) approximates the continuous adjoint equation (8) in the case of smooth solutions. Inserting exact solution values \(p(t,x_j)\) and \(y(t,x_j)\) (still denoted by \(y_j\) to simplify notation) in the semi-discrete scheme (21) gives the residual-type local spatial errors

$$\begin{aligned} r_j(t) = \partial _t p(t,x_j) - \frac{1}{\Delta x}\sum _{i=-2}^{2} \partial _{y_j}L_{i,j}({\varvec{y}}(t))\,p(t,x_{j+i}). \end{aligned}$$
(23)

Taylor expansion around \(x_j\) yields

$$\begin{aligned} r_j(t) = \partial _t p(t,x_j) - \sum _{k\ge 0} \Delta x^k \frac{1}{(k+1)!}\sum _{i=-2}^{2} i^{k+1}\,\partial _{y_j}L_{i,j}({\varvec{y}}(t)) \,\partial _x^{k+1}p(t,x_j), \end{aligned}$$
(24)

where we have already used that the sum of the \(L_{i,j}\) disappears. The method is said to have adjoint consistency order q if \(r_j(t)={\mathcal {O}}(\Delta x^q)\). In what follows, we will show that the adjoint WENO3 scheme satisfies all conditions for order \(q=3\).

First, we have to calculate \(\partial _{y_j}L_{i,j}\), i.e., particularly the derivatives of the numerical flux functions defined in (15), (16). Since \(\omega _1^\pm +\omega _2^\pm =1\) for all \({\varvec{y}}(t)\), we deduce \(\partial _{y_k}\omega _1^\pm =-\partial _{y_k}\omega _2^\pm\). Introducing the notation

$$\begin{aligned} {\bar{f}}_j^{\pm }(y_{j-1},y_j,y_{j+1}) :=\mp \frac{1}{2}f^\pm (y_{j-1}) \pm f^\pm (y_{j}) \mp \frac{1}{2}f^\pm (y_{j+1}), \end{aligned}$$
(25)

we find

$$\begin{aligned} \begin{aligned} \partial _{y_{j-1}} {\hat{f}}^+_{j+1/2} =&\; \partial _{y_{j-1}} \omega _1^+ {\bar{f}}_j^+ - \frac{1}{2} (f^+)'(y_{j-1})w_1^+, \\ \partial _{y_{j}} {\hat{f}}^+_{j+1/2} =&\; \partial _{y_{j}} \omega _1^+ {\bar{f}}_j^+ + (f^+)'(y_j) \left( \frac{3}{2} w_1^+ + \frac{1}{2} w_2^+ \right) , \\ \partial _{y_{j+1}} {\hat{f}}^+_{j+1/2} =&\; \partial _{y_{j+1}} \omega _1^+ {\bar{f}}_j^+ + \frac{1}{2} (f^+)'(y_{j+1})w_2^+ , \end{aligned} \end{aligned}$$
(26)

and

$$\begin{aligned} \begin{aligned} \partial _{y_{j-1}} {\hat{f}}^-_{j-1/2} =&\; \partial _{y_{j-1}} \omega _1^- {\bar{f}}_j^- + \frac{1}{2} (f^-)'(y_{j-1})w_1^-, \\ \partial _{y_{j}} {\hat{f}}^-_{j-1/2} =&\; \partial _{y_{j}} \omega _1^- {\bar{f}}_j^- + (f^-)'(y_j) \left( \frac{1}{2} w_1^- + \frac{3}{2} w_2^- \right) , \\ \partial _{y_{j+1}} {\hat{f}}^-_{j-1/2} =&\; \partial _{y_{j+1}} \omega _1^- {\bar{f}}_j^- - \frac{1}{2} (f^-)'(y_{j+1})w_2^- . \end{aligned} \end{aligned}$$
(27)

We have the following three lemmata.

Lemma 3.1

Suppose \(f(y),\;y(t,\cdot )\in C^2({\mathbb {R}})\). Then

$$\begin{aligned} \partial _{y_k} \omega _1^\pm {\bar{f}}^{\pm }_j(y_{j-1},y_j,y_{j+1})= {\mathcal {O}}\left( \Delta x^3\right) ,\quad k=j-1,j,j+1. \end{aligned}$$
(28)

Proof

Taylor expansion gives \({\bar{f}}^{\pm }_j={\mathcal {O}}(\Delta x^2)\). It remains to show that \(\partial _{y_k} \omega _1^\pm ={\mathcal {O}}(\Delta x)\). Indeed, we have

$$\begin{aligned} \partial _{y_k} \omega _1^\pm = \frac{{\tilde{\omega }}_2^\pm }{({\tilde{\omega }}_1^\pm + {\tilde{\omega }}_2^\pm )^2} \,\partial _{y_k} {\tilde{\omega }}_1^\pm - \frac{{\tilde{\omega }}_1^\pm }{({\tilde{\omega }}_1^\pm + {\tilde{\omega }}_2^\pm )^2} \,\partial _{y_k} {\tilde{\omega }}_2^\pm \,. \end{aligned}$$
(29)

Since \(\beta _i^\pm ={\mathcal {O}}(\Delta x^2)\) for smooth flux functions f(y), the two quotients are bounded by \(({\tilde{\omega }}_i^\pm )^{-1}={\mathcal {O}}(\varepsilon ^2)\), \(i=2,1\), respectively, for \(\Delta x\rightarrow 0\). Taylor expansions of the derivatives \(\partial _{y_k} {\tilde{\omega }}_i^\pm =-2\gamma _i^\pm (\varepsilon +\beta _i^\pm )^{-3} \partial _{y_k}\beta _i^\pm\), \(i=1,2\), show \({\mathcal {O}}(\Delta x)\) for these terms and therefore also for \(\partial _{y_k} \omega _1^\pm\). \(\square\)

Lemma 3.2

Let \(\{x_{j-1},x_{j},x_{j+1}\}\) and \(\{x_{j},x_{j+1},x_{j+2}\}\) be two neighboring stencils and \(w_{i,j}^\pm\), \(w_{i,j+1}^\pm\), \(i=1,2,\) the corresponding weights. Suppose \(f(y),\;y(t,\cdot )\in C^3({\mathbb {R}})\). Then

$$\begin{aligned} w_{i,j+1}^\pm - w_{i,j}^\pm = {\mathcal {O}}\left( \Delta x^4\right) ,\quad i=1,2. \end{aligned}$$
(30)

Proof

We consider the weights \(w_{1,j}^+\) and \(w_{1,j+1}^+\). Analogous calculations can be done for the other cases. We set \(h(x):=f^+(y(x))\) and define \(h_j:=f^+(y(x_j))\). Then

$$\begin{aligned} \begin{aligned} w_{1,j+1}^+ - w_{1,j}^+ =&\; \frac{{\tilde{\omega }}_{1,j+1}^+}{{\tilde{\omega }}_{1,j+1}^+ + {\tilde{\omega }}_{2,j+1}^+} - \frac{{\tilde{\omega }}_{1,j}^+}{{\tilde{\omega }}_{1,j}^+ + {\tilde{\omega }}_{2,j}^+}\\ =&\; \frac{ \frac{{\tilde{\omega }}_{2,j}^+}{{\tilde{\omega }}_{1,j}^+}- \frac{{\tilde{\omega }}_{2,j+1}^+}{{\tilde{\omega }}_{1,j+1}^+} }{ \left( 1+ \frac{{\tilde{\omega }}_{2,j}^+}{{\tilde{\omega }}_{1,j}^+}\right) \, \left( 1+ \frac{{\tilde{\omega }}_{2,j+1}^+}{{\tilde{\omega }}_{1,j+1}^+}\right) }\,. \end{aligned} \end{aligned}$$
(31)

Due to the strict positivity of the weights, it remains to study the asymptotic behaviour of the numerator. Using the definitions (17) and (18), we have

$$\begin{aligned} D_{{\tilde{w}}}:=\frac{{\tilde{\omega }}_{2,j}^+}{{\tilde{\omega }}_{1,j}^+}- \frac{{\tilde{\omega }}_{2,j+1}^+}{{\tilde{\omega }}_{1,j+1}^+} = \frac{\gamma _2^+}{\gamma _1^+}\, \frac{(\varepsilon + \beta _{1,j}^+)^2(\varepsilon + \beta _{2,j+1}^+)^2 - (\varepsilon + \beta _{1,j+1}^+)^2(\varepsilon + \beta _{2,j}^+)^2}{(\varepsilon + \beta _{2,j}^+)^2(\varepsilon + \beta _{2,j+1}^+)^2} \end{aligned}$$
(32)

with the smoothness indicators

$$\begin{aligned} \begin{aligned} \beta _{1,j}^+ =&\; (h_{j}-h_{j-1})^2,\quad \beta _{2,j}^+ = (h_{j+1}-h_{j})^2\,,\\ \beta _{1,j+1}^+ =&\; (h_{j+1}-h_{j})^2,\quad \beta _{2,j+1}^+ = (h_{j+2}-h_{j+1})^2\,. \end{aligned} \end{aligned}$$
(33)

Taylor expansion at \(x_j\) yields in (32)

$$\begin{aligned} D_{{\tilde{w}}} = \frac{\gamma _2^+}{\varepsilon ^4\gamma _1^+} \left( 4\varepsilon ^3 \Delta x^4 \left( (h''_j)^2+h'_jh'''_j\right) + {\mathcal {O}}(\varepsilon ^3 \Delta x^5) \right) \,, \end{aligned}$$
(34)

which shows the assertion. \(\square\)

Lemma 3.3

Suppose \(f(y),\;y(t,\cdot )\in C^2({\mathbb {R}})\). Then

$$\begin{aligned} \omega _1^\pm - \gamma _1^\pm = {\mathcal {O}}\left( \Delta x^3\right) \,. \end{aligned}$$
(35)

Proof

We first consider \(\omega _1^+ - \gamma _1^+\). The difference can be expressed by

$$\begin{aligned} \omega _1^+ - \gamma _1^+ = \frac{{\tilde{\omega }}_1^+-\gamma _1^+ ({\tilde{\omega }}_1^+ + {\tilde{\omega }}_2^+)}{{\tilde{\omega }}_1^+ + {\tilde{\omega }}_2^+}. \end{aligned}$$
(36)

The denominator is bounded from below by \((\gamma _1^+ + \gamma _2^+)\varepsilon ^{-2}=\varepsilon ^{-2}>0\). Further, we deduce for the numerator

$$\begin{aligned} N_{{\tilde{\omega }}}:={\tilde{\omega }}_1^+-\gamma _1^+ ({\tilde{\omega }}_1^+ + {\tilde{\omega }}_2^+) = \gamma _1^+\gamma _2^+\;\frac{(\varepsilon +\beta _2^+)^2-(\varepsilon +\beta _1^+)^2}{(\varepsilon +\beta _1^+)^2\,(\varepsilon +\beta _2^+)^2}\,. \end{aligned}$$
(37)

Let \(h(x):=f^+(y(x))\) and define \(h_j:=f^+(y(x_j))\). Inserting the smoothness indicators \(\beta _1^+=(h_j-h_{j-1})^2\) and \(\beta _2^+=(h_{j+1}-h_{j})^2\), Taylor expansion at \(x_j\) yields

$$\begin{aligned} N_{{\tilde{\omega }}} = \frac{\gamma _1^+\gamma _2^+}{\varepsilon ^4} \left( 4\varepsilon \Delta x^3 h'_jh''_j + {\mathcal {O}}\left( \Delta x^5\right) \right) \,. \end{aligned}$$
(38)

Putting this together with the bound for the denominator stated above gives \(\omega _1^+ - \gamma _1^+ = {\mathcal {O}}(\Delta x^3/\varepsilon )\), from which we can conclude the proof. The same arguments apply to the second difference \(\omega _1^- - \gamma _1^-\). \(\square\)
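The rates in Lemmas 3.2 and 3.3 can be observed numerically. The sketch below is only an illustration under stated assumptions: \(h(x)=\sin (x)\) is a stand-in for \(f^+(y(x))\), the evaluation point \(x_j=1\) and the helper names are hypothetical, and the step sizes are chosen with \(\Delta x^2\) well below \(\varepsilon =10^{-6}\) so that the smoothness indicators are small compared to \(\varepsilon\).

```python
import numpy as np

def omega1(h_left, h_mid, h_right, eps=1e-6, g1=1 / 3, g2=2 / 3):
    """Nonlinear weight omega_1^+ from (17)-(18) for one three-point stencil."""
    b1 = (h_mid - h_left) ** 2
    b2 = (h_right - h_mid) ** 2
    t1, t2 = g1 / (eps + b1) ** 2, g2 / (eps + b2) ** 2
    return t1 / (t1 + t2)

def weight_errors(dx, x=1.0, h=np.sin):
    """Return (|omega_{1,j} - gamma_1|, |omega_{1,j+1} - omega_{1,j}|) at x_j = x."""
    w_j = omega1(h(x - dx), h(x), h(x + dx))
    w_jp1 = omega1(h(x), h(x + dx), h(x + 2 * dx))
    return abs(w_j - 1 / 3), abs(w_jp1 - w_j)

dxs = [2e-4, 1e-4, 5e-5]
errs = np.array([weight_errors(dx) for dx in dxs])
# observed orders under halving of dx; first column: Lemma 3.3, second column: Lemma 3.2
orders = np.log2(errs[:-1] / errs[1:])
print(orders)
```

The first column should be close to three and the second close to four. For larger step sizes the indicators are no longer small compared to \(\varepsilon\), and the factor \(1/\varepsilon\) in the estimates becomes visible.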

We are now ready to state the main result of this section.

Theorem 3.1

Let \(f(y),\;y(t,\cdot )\in C^3({\mathbb {R}})\) and \(p(t,\cdot )\in C^4({\mathbb {R}})\). Then the adjoint WENO3 scheme (20) is adjoint consistent of order three, i.e., \(r_j(t)={\mathcal {O}}(\Delta x^3)\) in (24).

Proof

Let us define \(d_k:=\sum _{i=-2,\ldots ,2}i^{k+1}\partial _{y_j}L_{i,j}({\varvec{y}}(t))\), \(k=0,1,2,\) and denote by \(w_{i,m}^\pm\) the weights that correspond to the stencils \(\{x_{m-1},x_{m},x_{m+1}\}\), \(m=j-1,j,j+1\). From (22), we calculate

$$\begin{aligned} \begin{aligned} d_0&\; = -\partial _{y_j} \left( 2L_{-2,j}({\varvec{y}}(t))+ L_{-1,j}({\varvec{y}}(t))-L_{1,j}({\varvec{y}}(t))- 2L_{2,j}({\varvec{y}}(t))\right) \\&\; = -\partial _{y_j}\left( {\hat{f}}^-_{j-3/2} + {\hat{f}}^+_{j-1/2} + {\hat{f}}^-_{j-1/2} + {\hat{f}}^+_{j+1/2} + {\hat{f}}^-_{j+1/2} + {\hat{f}}^+_{j+3/2}\right) , \end{aligned} \end{aligned}$$
(39)

which gives by using (26), (27) for different stencils and Lemma 3.1 for all terms \(\partial _{y_j} \omega _1^\pm {\bar{f}}^{\pm }_m\) with \(m=j-1,j,j+1\),

$$\begin{aligned} \begin{aligned} d_0 =&\; -\left( \frac{1}{2}w_{1,j}^{-}+\frac{1}{2}w_{1,j+1}^{-}-\frac{1}{2}w_{2,j-1}^{-}+\frac{3}{2}w_{2,j}^{-} \right) (f^-)'(y_j) \\&\; -\left( \frac{3}{2}w_{1,j}^{+}-\frac{1}{2}w_{1,j+1}^{+}+\frac{1}{2}w_{2,j-1}^{+}+\frac{1}{2}w_{2,j}^{+} \right) (f^+)'(y_j) + {\mathcal {O}}(\Delta x^3). \end{aligned} \end{aligned}$$
(40)

Finally, Lemma 3.2 and the property \(w_{1,j}^\pm +w_{2,j}^\pm =1\) yield

$$\begin{aligned} d_0 = - \left( (f^-)'(y_j) + (f^+)'(y_j) \right) + {\mathcal {O}}(\Delta x^3) = -f'(y_j) + {\mathcal {O}}(\Delta x^3). \end{aligned}$$
(41)

Analogously, we derive

$$\begin{aligned} \begin{aligned} d_1 =&\; \left( \frac{1}{2}w_{1,j}^{-}-\frac{1}{2}w_{1,j+1}^{-}-\frac{3}{2}w_{2,j-1}^{-}+\frac{3}{2}w_{2,j}^{-} \right) (f^-)'(y_j) \\&\; + \left( -\frac{3}{2}w_{1,j}^{+}+\frac{3}{2}w_{1,j+1}^{+}+\frac{1}{2}w_{2,j-1}^{+}-\frac{1}{2}w_{2,j}^{+} \right) (f^+)'(y_j) + {\mathcal {O}}(\Delta x^3) \end{aligned} \end{aligned}$$
(42)

and

$$\begin{aligned} \begin{aligned} d_2 =&\; \left( -\frac{1}{2}w_{1,j}^{-}-\frac{1}{2}w_{1,j+1}^{-}+\frac{7}{2}w_{2,j-1}^{-}-\frac{3}{2}w_{2,j}^{-} \right) (f^-)'(y_j) \\&\; + \left( -\frac{3}{2}w_{1,j}^{+}+\frac{7}{2}w_{1,j+1}^{+}-\frac{1}{2}w_{2,j-1}^{+}-\frac{1}{2}w_{2,j}^{+} \right) (f^+)'(y_j) + {\mathcal {O}}(\Delta x^3)\,. \end{aligned} \end{aligned}$$
(43)

Lemma 3.2 directly shows that \(d_1={\mathcal {O}}(\Delta x^3)\). Using \(w_1^\pm +w_2^\pm =1\) and again Lemma 3.2, the linear combinations of the weights in \(d_2\) can be simplified to \(2-3w_{1,j}^-\) and \(3w_{1,j}^+-1\) up to order \({\mathcal {O}}(\Delta x^4)\), respectively. Applying now Lemma 3.3 with \(\gamma _1^+=1/3\) and \(\gamma _1^-=2/3\) to these expressions gives \(d_2={\mathcal {O}}(\Delta x^3)\).

In a last step, we use the asymptotic expressions for \(d_i\), \(i=0,1,2,\) to calculate the residual-type local spatial error

$$\begin{aligned} \begin{aligned} r_j(t) =&\; \partial _tp(t,x_j)-\sum _{k=0}^2\Delta x^k\frac{1}{(k+1)!} d_k\,\partial _{x}^{k+1}p(t,x_j) + {\mathcal {O}}\left( \Delta x^3\right) \\ =&\; \partial _tp(t,x_j) + f'(y(t,x_j))\,\partial _xp(t,x_j) + {\mathcal {O}}\left( \Delta x^3\right) = {\mathcal {O}}\left( \Delta x^3\right) \,. \end{aligned} \end{aligned}$$
(44)

This concludes the proof. \(\square\)

4 Numerical experiments

In this section, we present some numerical examples for Burgers' equation, i.e., we study problems with the nonlinear flux function \(f(y)=\frac{1}{2} y^2\) in (4). The first example with smooth initial data and solution is chosen in order to check the third-order convergence of the discrete adjoint WENO3 method as stated in Theorem 3.1. In the second example, the approximation property of the discrete adjoint in the case of a shock in the initial solution is investigated and compared to approximations computed by means of the first-order modified Lax–Friedrichs (LF) and Engquist–Osher (EO) schemes. These schemes read

$$\begin{aligned} \begin{aligned} y_j^0 =&\; u_0(x_j)\,,\\ y_j^{n+1} =&\; y_j^n - \frac{\Delta t}{\Delta x} \left( {\hat{f}}(y_j^n,y_{j+1}^n) - {\hat{f}}(y_{j-1}^n,y_{j}^n)\right) ,\quad n=0,\ldots ,n_T-1, \end{aligned} \end{aligned}$$
(45)

with \(y_j^n\approx y(n\Delta t,x_j)\), \(n_T\,\Delta t=T\), and numerical fluxes given by

$$\begin{aligned} \begin{aligned} {\hat{f}}_{LF}(a,b) =&\; \frac{1}{2} \left( f(b)+f(a)\right) - \frac{\gamma }{2}\frac{\Delta x}{\Delta t} \left( b-a \right) , \quad \gamma \in (0,1)\,,\\ {\hat{f}}_{EO}(a,b) =&\; f(0) + \int _{0}^{a}\max (0,f'(s))\,ds + \int _{0}^{b}\min (0,f'(s))\,ds\,. \end{aligned} \end{aligned}$$
(46)
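For Burgers' flux, both numerical fluxes in (46) have simple closed forms, and one step of the scheme (45) takes only a few lines. The following is a minimal sketch; the function names and the zero ghost values at the boundary are assumptions made for illustration.

```python
import numpy as np

def f(y):
    """Burgers flux f(y) = y^2 / 2."""
    return 0.5 * y ** 2

def flux_lf(a, b, dx, dt, gamma=0.5):
    """Modified Lax-Friedrichs flux from (46)."""
    return 0.5 * (f(a) + f(b)) - 0.5 * gamma * dx / dt * (b - a)

def flux_eo(a, b):
    """Engquist-Osher flux from (46); the integrals are evaluated in closed form
    for Burgers, where f(0) = 0 and f'(s) = s."""
    return 0.5 * np.maximum(a, 0.0) ** 2 + 0.5 * np.minimum(b, 0.0) ** 2

def step(y, dx, dt, flux):
    """One conservative step of the scheme with zero ghost values (an assumption)."""
    yp = np.concatenate([[0.0], y, [0.0]])
    fh = flux(yp[:-1], yp[1:])               # fluxes at the interfaces j+1/2, j = 0..N
    return y - dt / dx * (fh[1:] - fh[:-1])
```

A modified LF step is obtained with `step(y, dx, dt, lambda a, b: flux_lf(a, b, dx, dt))`. Both fluxes are consistent, i.e., they reduce to f(a) for equal arguments.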

Applying a standard Lagrangian approach and discrete adjoint calculus, the discrete adjoint schemes can be derived from [16, Prop. 3.1] as

$$\begin{aligned} \begin{aligned} p_j^{n_T} =&\; \partial _yG(y_j^{n_T},y_d(x_j))\,,\\ p_j^n =&\; c_{j-1}p_{j-1}^{n+1} + c_{j}p_{j}^{n+1} + c_{j+1}p_{j+1}^{n+1},\quad n=n_T-1,\ldots ,0, \end{aligned} \end{aligned}$$
(47)

with the coefficients

$$\begin{aligned} c_{j-1}=\frac{\gamma }{2} - \frac{\Delta t}{2\Delta x}f'(y_j^{n+1}),\quad c_j=1-\gamma ,\quad c_{j+1}=\frac{\gamma }{2} + \frac{\Delta t}{2\Delta x}f'(y_j^{n+1}), \end{aligned}$$
(48)

for the LF scheme and

$$\begin{aligned} \begin{aligned} c_{j-1} =&\; \frac{\Delta t}{2\Delta x} \left( |f'(y_j^{n+1})| - f'(y_j^{n+1}) \right) ,\quad c_j=1-\frac{\Delta t}{\Delta x} |f'(y_j^{n+1})|,\\ c_{j+1} =&\; \frac{\Delta t}{2\Delta x} \left( |f'(y_j^{n+1})| + f'(y_j^{n+1}) \right) , \end{aligned} \end{aligned}$$
(49)

for the EO scheme. Convergence of these schemes has been intensively studied in [1, 10,11,12, 29]. The choice \(\gamma \!=\!1\) leads to the classical LF method. Stability requirements for the adjoint LF and EO schemes yield the optimal value \(\gamma ^\star \!=\!0.5\) together with the CFL-condition \(\Delta t \le \gamma ^\star \Delta x/\sup |f'(y)|\), see e.g. [16]. Then, both schemes converge for Lipschitz continuous end data \(p^T(x)\) in (8). The stronger condition \(\Delta t \le \gamma ^\star (\Delta x)^{2-q}/\sup |f'(y)|\), \(0<q<1\), ensures the convergence of the modified LF scheme for discontinuous end data, too [11, 12]. Convergence for slightly modified end data and less numerical viscosity has been recently studied in [1].
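One backward step of (47) with the coefficients (48) or (49) can be sketched as follows. The function name, the zero ghost values for p, and the convention that the caller passes the array of values \(f'(y_j)\) at the time level appearing in (48) and (49) are assumptions of this illustration.

```python
import numpy as np

def adjoint_step(p_next, fprime, dx, dt, scheme="LF", gamma=0.5):
    """One backward step of (47); fprime holds the values f'(y_j) used in (48)-(49)."""
    lam = dt / dx
    if scheme == "LF":                       # coefficients (48)
        c_m = 0.5 * gamma - 0.5 * lam * fprime
        c_0 = (1.0 - gamma) * np.ones_like(fprime)
        c_p = 0.5 * gamma + 0.5 * lam * fprime
    else:                                    # Engquist-Osher, coefficients (49)
        c_m = 0.5 * lam * (np.abs(fprime) - fprime)
        c_0 = 1.0 - lam * np.abs(fprime)
        c_p = 0.5 * lam * (np.abs(fprime) + fprime)
    pp = np.concatenate([[0.0], p_next, [0.0]])   # zero ghost values (assumption)
    return c_m * pp[:-2] + c_0 * pp[1:-1] + c_p * pp[2:]
```

In both cases the three coefficients sum to one, and under the CFL condition they are non-negative, so each new adjoint value is a convex combination of its neighbours. This is the discrete analogue of the adjoint being transported along characteristics.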

In order to get a fully discrete scheme for WENO3, the differential equation (12) is numerically solved by the three-stage third-order strong-stability-preserving Runge-Kutta method SSPRK3, which offers good stability properties [13, 14, 16, 19]. In the Shu-Osher representation, it reads

$$\begin{aligned} \begin{aligned} {\varvec{y}}_0^n =&\; {\varvec{y}}^n\,,\\ {\varvec{y}}_1^n =&\; {\varvec{y}}_0^n - \Delta t F_{\Delta x}({\varvec{y}}_0^n)\,,\\ {\varvec{y}}_2^n =&\; \frac{3}{4} {\varvec{y}}_0^n + \frac{1}{4} {\varvec{y}}_1^n - \frac{1}{4} \Delta t F_{\Delta x}({\varvec{y}}_1^n)\,,\\ {\varvec{y}}^{n+1} =&\; \frac{1}{3} {\varvec{y}}_0^n + \frac{2}{3} {\varvec{y}}_2^n - \frac{2}{3} \Delta t F_{\Delta x}({\varvec{y}}_2^n)\,,\quad n=0,\ldots ,n_T-1. \end{aligned} \end{aligned}$$
(50)

The corresponding adjoint time discretization has the form (see e.g. [16])

$$\begin{aligned} \begin{aligned} {\varvec{p}}_0^n =&\; {\varvec{p}}^{n+1}\,,\\ {\varvec{p}}_1^n =&\; \frac{2}{3} {\varvec{p}}_0^n - \frac{2}{3} \Delta t \,\nabla _{{\varvec{y}}} F_{\Delta x}({\varvec{y}}_2^n)^T\,{\varvec{p}}_0^n\,,\\ {\varvec{p}}_2^n =&\; \frac{1}{4} {\varvec{p}}_1^n - \frac{1}{4} \Delta t \,\nabla _{{\varvec{y}}} F_{\Delta x}({\varvec{y}}_1^n)^T\,{\varvec{p}}_1^n\,,\\ {\varvec{p}}^n =&\; \frac{1}{3} {\varvec{p}}_0^n + \frac{3}{4} {\varvec{p}}_1^n + {\varvec{p}}_2^n - \Delta t \,\nabla _{{\varvec{y}}} F_{\Delta x}({\varvec{y}}_0^n)^T\,{\varvec{p}}_2^n\,,\quad n=n_T-1,\ldots ,0. \end{aligned} \end{aligned}$$
(51)

We note that the adjoint scheme has only order two, which is the upper barrier for three-stage third-order SSPRK methods [16].
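The forward step in Shu-Osher form and the adjoint step (51) can be sketched side by side, with the forward coefficients of the last stage taken as 1/3 and 2/3, which is what the adjoint weights in (51) correspond to. The operator used here, \(F({\varvec{y}})=D(\frac{1}{2}{\varvec{y}}^2)\) with a periodic central difference matrix D, is a hypothetical smooth stand-in for the WENO3 operator, chosen only because its Jacobian is available in closed form; all function names are assumptions.

```python
import numpy as np

def ssprk3_step(y, dt, F):
    """Forward SSPRK3 step; also returns the stages needed by the adjoint."""
    y0 = y
    y1 = y0 - dt * F(y0)
    y2 = 0.75 * y0 + 0.25 * y1 - 0.25 * dt * F(y1)
    y_new = y0 / 3 + 2 * y2 / 3 - 2 * dt * F(y2) / 3
    return y_new, (y0, y1, y2)

def ssprk3_adjoint_step(p_new, dt, stages, jacT_vec):
    """Adjoint step (51); jacT_vec(y, p) must return grad F(y)^T p."""
    y0, y1, y2 = stages
    p0 = p_new
    p1 = 2 * p0 / 3 - 2 * dt * jacT_vec(y2, p0) / 3
    p2 = 0.25 * p1 - 0.25 * dt * jacT_vec(y1, p1)
    return p0 / 3 + 0.75 * p1 + p2 - dt * jacT_vec(y0, p2)

# Hypothetical test operator: F(y) = D (y^2 / 2), periodic central difference D
n, dx, dt = 16, 0.1, 0.02
D = (np.roll(np.eye(n), 1, axis=1) - np.roll(np.eye(n), -1, axis=1)) / (2 * dx)
F = lambda y: D @ (0.5 * y ** 2)
jacT_vec = lambda y, p: y * (D.T @ p)        # (D diag(y))^T p = diag(y) D^T p
```

By construction, the adjoint step is the transpose of the linearized forward step: for any direction v, the derivative of \({\varvec{p}}^{n+1}\cdot {\varvec{y}}^{n+1}\) with respect to \({\varvec{y}}^{n}\) in direction v equals \({\varvec{p}}^{n}\cdot v\). This gives a convenient finite-difference check for an implementation.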

In the final experiment, we solve an optimal control problem with a discontinuous target, proposed in [16]. The discrete adjoint \({\varvec{p}}^0\) provides gradient information, which can be directly used to set up the following algorithm:

  0. Given a control \({\varvec{u}}_0:={\varvec{u}}^{(j)}\) at iteration j.

  1. Compute the discrete adjoint \({\varvec{p}}^0({\varvec{u}}^{(j)})\) and update \({\varvec{u}}^{(j+1)}={\varvec{u}}^{(j)}-\alpha _j{\varvec{p}}^0({\varvec{u}}^{(j)})\) with \(\alpha _j\) such that Armijo’s condition

    $$\begin{aligned} J({\varvec{y}}^{n_T}({\varvec{u}}^{(j+1)}),{\varvec{y}}_d) \le J({\varvec{y}}^{n_T}({\varvec{u}}^{(j)}),{\varvec{y}}_d) - \frac{1}{2}\alpha _j \Vert {\varvec{p}}^0({\varvec{u}}^{(j)}) \Vert ^2_{L^2(I)} \end{aligned}$$

    is fulfilled. If it is not satisfied, set \(\alpha _j:=0.95\,\alpha _j\) and check the condition again.

  2. Stop if \(|J({\varvec{y}}^{n_T}({\varvec{u}}^{(j+1)}),{\varvec{y}}_d)- J({\varvec{y}}^{n_T}({\varvec{u}}^{(j)}),{\varvec{y}}_d)|\le tol\). Otherwise set \(j:=j+1\) and proceed with step 1.
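Steps 0 to 2 can be sketched in Python as follows. The callables `J` and `grad` are placeholders for the discrete objective and the discrete adjoint \({\varvec{p}}^0\), and `weight` is the quadrature weight of the discrete \(L^2(I)\)-norm. Restarting every line search from the same trial step `alpha0`, the lower bound on the step size, and the cap `max_iter` are assumptions of this sketch, since the text does not specify them.

```python
def gradient_descent_armijo(J, grad, u, alpha0=1.0, shrink=0.95,
                            tol=1e-12, weight=1.0, max_iter=1000):
    """Steepest descent with the backtracking rule of steps 0 to 2.

    J      : callable, objective value for a control vector u
    grad   : callable, gradient at u (in the text: the discrete adjoint p^0)
    weight : quadrature weight of the discrete L2 norm (e.g. dx)
    alpha0 : trial step size at the start of every iteration (assumption)
    """
    J_old = J(u)
    for j in range(max_iter):
        g = grad(u)
        g_norm2 = weight * sum(gi * gi for gi in g)
        alpha = alpha0
        while True:
            u_new = [ui - alpha * gi for ui, gi in zip(u, g)]
            J_new = J(u_new)
            # Armijo condition of step 1; the lower bound on alpha is a
            # safeguard against an endless loop and not part of the text
            if J_new <= J_old - 0.5 * alpha * g_norm2 or alpha < 1e-16:
                break
            alpha *= shrink
        # Stopping test of step 2
        if abs(J_new - J_old) <= tol:
            return u_new, J_new, j + 1
        u, J_old = u_new, J_new
    return u, J_old, max_iter


# Toy quadratic objective (illustrative only): J(u) = 0.5 * |u - target|^2
target = [1.0, -2.0, 0.5]
J_toy = lambda u: 0.5 * sum((a - b) ** 2 for a, b in zip(u, target))
grad_toy = lambda u: [a - b for a, b in zip(u, target)]
u_opt, J_opt, n_iter = gradient_descent_armijo(J_toy, grad_toy,
                                               [0.0, 0.0, 0.0], alpha0=3.0)
```

In the toy run, the trial step `alpha0=3.0` is too large, so the backtracking loop shrinks it below one before the first update is accepted.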

In general, taking the adjoint as a descent direction may increase the complexity of the optimization process due to the production of additional discontinuities [5, 23, 24]. A careful choice of the initial guess \({\varvec{u}}_0\) can remedy this serious problem. We follow the approach proposed in [17] and first solve the conservation law

$$\begin{aligned} \begin{aligned} \partial _t z + \partial _x f(z) =&\; 0,\quad (t,x)\in \Omega _T,\\ z(0,x) =&\;y_d(-x), \end{aligned} \end{aligned}$$
(52)

where \(y_d\) is the target given in (1). The initial guess is then chosen as \(u_0=z(T,-x)\). Formally, as pointed out in [17], (52) is obtained by reversing t and x in (4) and taking \(y_d\) as initial condition. The advantage of this approach is that it delivers a control whose entropy solution is close to the target and for which the locations of the discontinuities almost coincide. Hence, the production of additional discontinuities within each iteration step is avoided, which improves the performance of the algorithm drastically. We will exemplify the influence of the choice of the initial guess in our optimal control problem.

4.1 Order test for the discrete adjoint for smooth data

This section is devoted to the numerical verification of the third-order convergence of the adjoint WENO3 scheme. For this purpose, we choose the computational domain \(\Omega _T=(0,0.5]\times [-1.5,1.5]\) and the objective functional

$$\begin{aligned} J(y(0.5,\cdot ;u_0),0) = \frac{1}{2}\,\int _{-\frac{3}{2}}^{\frac{3}{2}} y(0.5,x;u_0)^2\,dx \end{aligned}$$
(53)

with the smooth initial data

$$\begin{aligned} u_0(x)={\left\{ \begin{array}{ll} e^{-\frac{1}{1-x^2}} &{} , |x| < 1, \\ 0 &{} , |x| \ge 1. \end{array}\right. } \end{aligned}$$
(54)

The exact solution \(y(t,x)\) can be directly computed from the method of characteristics, i.e., \(y(t,x)=u_0(x_0(x(t),t))\) with \(x_0(x(t),t)\) being the solution of the nonlinear equation \(x(t)=x_0+u_0(x_0)\,t\). A reference solution \(y_T\approx y(0.5,x)\) at the final time is computed by Newton’s method with a tight tolerance of \(10^{-14}\).
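The computation of the reference solution can be sketched in Python as follows. The function names are hypothetical, and the bisection safeguard around the Newton iteration is an addition of this sketch that is not mentioned in the text; it uses the bracket \(x-t\,e^{-1}\le x_0\le x\), which follows from \(0\le u_0\le e^{-1}\).

```python
import math


def u0(x):
    """Smooth initial data (54)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def du0(x):
    """Derivative of the initial data (54)."""
    if abs(x) >= 1.0:
        return 0.0
    w = 1.0 - x * x
    return -2.0 * x / (w * w) * math.exp(-1.0 / w)


def foot_point(x, t, tol=1e-14, max_iter=100):
    """Solve x = x0 + u0(x0) * t for x0 by a safeguarded Newton iteration."""
    lo, hi = x - t * math.exp(-1.0), x   # bracket, since 0 <= u0 <= exp(-1)
    x0 = x
    for _ in range(max_iter):
        g = x0 + u0(x0) * t - x
        if abs(g) <= tol:
            break
        # g is increasing for t = 0.5, so the sign of g tells the side
        if g > 0.0:
            hi = x0
        else:
            lo = x0
        newton = x0 - g / (1.0 + du0(x0) * t)
        # Fall back to bisection if the Newton step leaves the bracket
        x0 = newton if lo <= newton <= hi else 0.5 * (lo + hi)
    return x0


def y_exact(x, t):
    """Exact solution y(t, x) = u0(x0) as long as no shock has formed."""
    return u0(foot_point(x, t))


# Reference values at the final time T = 0.5 on a few sample points
samples = [-1.4 + 0.05 * k for k in range(57)]
y_T = [y_exact(x, 0.5) for x in samples]
```

Outside the region reached by the support of \(u_0\), the iteration stops immediately with \(x_0=x\), so that `y_exact` returns zero there.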

Since shocks are not present, we find \(p^T(x)=y(0.5,x)\) in (8). We also note that the characteristic curves of the adjoint problem coincide with the characteristic curves of the forward problem. Thus, the corresponding reversible solution p(0, x) at time \(t\!=\!0\) is given by \(u_0(x)\), which serves as reference solution for the adjoint.
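This argument can be written out in one line. As a sketch, assume that the adjoint equation (8) is the transport equation \(\partial _t p + f'(y)\,\partial _x p = 0\) with end data \(p(T,\cdot )=p^T\), and let \(x(t)=x_0+u_0(x_0)\,t\) denote the characteristic starting at \(x_0\). Since both y and p are constant along \(x(t)\), we obtain for \(T=0.5\)

$$\begin{aligned} p(0,x_0) = p(T,x(T)) = p^T(x(T)) = y(T,x(T)) = u_0(x_0). \end{aligned}$$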

We use a sequence of spatial meshes with \(N=150\cdot 2^i,\;i=0,\ldots ,6,\) grid points and set \(\Delta t=0.5\,\Delta x\). In order to keep the temporal error below \({\mathcal {O}}((\Delta x)^3)\), we apply the classical fourth-order four-stage explicit Runge-Kutta method (ERK4). Its adjoint time discretization also has order four [15] for smooth solutions and therefore the overall scheme is suitable to check the order three of the adjoint WENO3 method. We also present results for the forward WENO3 method to document the error of the approximated starting value \({\varvec{p}}^{n_T}={\varvec{y}}^{n_T}\). The \(L^\infty\)-errors collected in Table 1 clearly show asymptotic order three of the spatial WENO3 discretization for both the forward and the adjoint numerical solution.
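The asymptotic order is usually read off from the errors on two consecutive meshes via \(\log _2(e_i/e_{i+1})\). The following Python sketch illustrates this computation. The errors in it are synthetic values of the form \(C(\Delta x)^3\) and are not the data of Table 1; the relation \(\Delta x=3/N\) and the function name `observed_orders` are assumptions as well.

```python
import math


def observed_orders(errors, refinement=2.0):
    """Experimental order of convergence between consecutive meshes."""
    return [math.log(e_coarse / e_fine) / math.log(refinement)
            for e_coarse, e_fine in zip(errors, errors[1:])]


# Synthetic third-order errors on N = 150 * 2**i points (NOT the Table 1 data)
mesh_sizes = [3.0 / (150 * 2 ** i) for i in range(7)]
synthetic_errors = [2.0 * h ** 3 for h in mesh_sizes]
orders = observed_orders(synthetic_errors)
```

For a third-order method, every entry of `orders` approaches three once the mesh is fine enough for the asymptotic regime.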

Table 1 Burgers problem with smooth initial data and smooth solution: \(L^\infty\)-error of the forward solution \(\Vert {\varvec{y}}_T-{\varvec{y}}^{n_T}\Vert _\infty\) at the final time \(T=0.5\) and adjoint solution \(\Vert {\varvec{u}}_0-{\varvec{p}}^0\Vert _\infty\) at time \(t=0\) for a sequence of spatial meshes with \(N=150,300,\ldots ,9600\) grid points

4.2 Approximation of the discrete adjoint in the case of shocks

We now consider discontinuous solutions with shocks. Our test case is taken from Example 2.1 with computational domain \(\Omega _T=(0,0.5]\times [-1,1]\). The reversible solution p(0, x) at \(t\!=\!0\) is given by (11). We apply the above-described forward and adjoint LF, EO, and WENO3 schemes with \(\Delta x=0.01,\;0.002\), and \(\Delta t=0.25\,\Delta x\). The corresponding numerical approximations \({\varvec{p}}^0\) are shown in Fig. 2.

Fig. 2 Burgers problem with discontinuous initial and final solution taken from Example 2.1. Numerical approximations \({\varvec{p}}^0\) to the reversible solution \(p_0:=p(0,x)\) given in (11) for the adjoint Lax–Friedrichs (LF), Engquist–Osher (EO) and WENO3 scheme applied with \(\Delta x=0.01\) (left), \(\Delta x=0.002\) (right), and \(\Delta t=0.25\,\Delta x\)

The first-order LF and EO schemes smear out the discontinuities, but deliver \(L^\infty\)-stable approximations and thus respect the analytical property of the adjoint. In the spirit of WENO schemes, the adjoint WENO3 scheme delivers a quite sharp resolution of the shocks at the price of bounded over- and undershoots of around \(5\%\). In Table 2, we report the \(L^\infty\)-error in the shock funnel for \(x\in [-0.3,0.3]\). All schemes converge quite rapidly. Note that convergence in the shock funnel is not always achieved since the interior boundary condition at shock positions as given in (9) does not appear on the discrete level, see the discussions in [1, 11, 12].

Table 2 Burgers problem with discontinuous initial and final solution taken from Example 2.1

4.3 Optimal control problem with discontinuous target

We consider the optimal control problem (1) with the objective functional [16]

$$\begin{aligned} J(y(0.5,\cdot ;u_0),y_d(x)) = \frac{1}{2}\,\int _{-1}^{1} |y(0.5,x;u_0)-y_d(x)|^2\,dx \end{aligned}$$
(55)

and the discontinuous target \(y_d\) defined by

$$\begin{aligned} y_d(x)={\left\{ \begin{array}{ll} 2x-\frac{1}{2} &{} , \frac{1}{4}\le x\le \frac{3}{4}, \\ 0 &{} , \text {otherwise}. \end{array}\right. } \end{aligned}$$
(56)

The optimal control \(u_0^\star\), which serves as a reference solution, is

$$\begin{aligned} u_0^\star (x)={\left\{ \begin{array}{ll} -2x+\frac{3}{2} &{} , \frac{1}{4}\le x\le \frac{3}{4}, \\ 0 &{} , \text {otherwise}. \end{array}\right. } \end{aligned}$$
(57)
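That (57) reproduces the target exactly can be checked with the method of characteristics, assuming Burgers' flux \(f(y)=y^2/2\) as in the previous experiments. The upward jump of \(u_0^\star\) at \(x=\frac{1}{4}\) opens a rarefaction fan, while the decreasing ramp is compressed along the characteristics \(x(t)=x_0+(\frac{3}{2}-2x_0)\,t\), which all meet in \(x=\frac{3}{4}\) at time \(t=\frac{1}{2}\). For \(0<t<\frac{1}{2}\), this gives the following sketch of the entropy solution:

$$\begin{aligned} y(t,x;u_0^\star )={\left\{ \begin{array}{ll} \frac{x-\frac{1}{4}}{t} &{} , \frac{1}{4}\le x\le \frac{1}{4}+t, \\ \frac{\frac{3}{2}-2x}{1-2t} &{} , \frac{1}{4}+t< x\le \frac{3}{4}, \\ 0 &{} , \text {otherwise}. \end{array}\right. } \end{aligned}$$

Letting \(t\rightarrow \frac{1}{2}\), the second branch collapses into a shock at \(x=\frac{3}{4}\) and the first branch becomes \(2x-\frac{1}{2}\) on \([\frac{1}{4},\frac{3}{4}]\), which is (56). Hence, the objective functional (55) vanishes at \(u_0^\star\).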

We will present results for two mesh sizes \(\Delta x=0.005,\;0.002\), and time steps \(\Delta t=0.25\,\Delta x\). The initial guess for the control is computed from (52) with the individual method under consideration. For WENO3 and the coarser mesh size, it is shown in Fig. 3 together with the corresponding state solution.
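The computation of the initial guess from (52) can be sketched in Python as follows. The sketch assumes Burgers' flux \(f(y)=y^2/2\), a first-order Engquist–Osher scheme on a cell-centred periodic grid as a stand-in for the individual methods, and hypothetical function names. On a grid that is symmetric about \(x=0\), the reflection \(x\mapsto -x\) amounts to reversing the order of the cell values.

```python
def y_d(x):
    """Discontinuous target (56)."""
    return 2.0 * x - 0.5 if 0.25 <= x <= 0.75 else 0.0


def eo_flux(a, b):
    """Engquist-Osher numerical flux for Burgers' flux f(y) = y**2 / 2."""
    return 0.5 * max(a, 0.0) ** 2 + 0.5 * min(b, 0.0) ** 2


def solve_burgers(z0, dx, dt, n_steps):
    """Explicit first-order EO scheme on a periodic grid."""
    z = list(z0)
    n = len(z)
    for _ in range(n_steps):
        # flux[i] approximates the flux through the left face of cell i
        flux = [eo_flux(z[i - 1], z[i]) for i in range(n)]
        z = [z[i] - dt / dx * (flux[(i + 1) % n] - flux[i]) for i in range(n)]
    return z


dx, T = 0.005, 0.5
dt = 0.25 * dx
n_cells = round(2.0 / dx)
x = [-1.0 + (i + 0.5) * dx for i in range(n_cells)]   # cell centres on [-1, 1]

z0 = [y_d(-xi) for xi in x]                    # initial data of (52)
zT = solve_burgers(z0, dx, dt, round(T / dt))
u_init = zT[::-1]                              # u_0(x_i) = z(T, -x_i)
```

Since the scheme is conservative, `u_init` carries the same discrete mass as the target, and its profile can be compared directly with the optimal control (57).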

In Fig. 4, the results of the gradient-based optimization procedure described above are plotted for the adjoint WENO3 method with mesh size \(\Delta x=0.005\) and tolerances \(tol_1=10^{-5}\) and \(tol_2=10^{-7}\). The value of the objective functional decreases from \(4.75\cdot 10^{-4}\) for \(tol_1\) to \(3.18\cdot 10^{-4}\) for \(tol_2\), resulting in a better approximation of the target \(y_d\). We conclude that the adjoint WENO3 method recovers both the initial data and the final state solution adequately. The shock of the target is sharply resolved and the rarefaction of the initial data is also recovered. In order to compare these results with those obtained from the LF and EO schemes, we perform 50 iterations of the optimization algorithm for both mesh sizes. The calculated optimal controls and their corresponding final state solutions are collected in Fig. 5. The adjoint WENO3 method resolves the shock sharply. In contrast, the LF method is too diffusive and provides only an unsatisfactory shock resolution; the numerical artifacts around the shocks are large. The optimized final state solution obtained by the EO scheme possesses very small numerical artifacts, but the shock is less sharply resolved and the spike of the target is slightly smeared out. In Table 3, we list the iteration history for all runs of the optimization. In every case, the LF method performs worse than the others. In terms of a low cost functional, the adjoint WENO3 method performs best. We also see the influence of the initial guess on the performance of the algorithm, which is due to the fact that using \({\varvec{u}}_0=0\) as the starting control produces artificial discontinuities within each iteration step.

Fig. 3 Optimal control problem. Initial control \({\varvec{u}}_0\) and optimal control \(u_0^\star\) (left), initial state solution \({\varvec{y}}^{n_T}\) at \(T=0.5\) and target \(y_d\) (right), computed with the WENO3 scheme and mesh size \(\Delta x=0.005\)

Fig. 4 Optimal control problem. Optimal control \(u_{0}^\star\) and target \(y_d\), numerically computed optimal control \({\varvec{u}}_0\) and corresponding state solution \({\varvec{y}}^{n_T}\) for tolerances \(tol_1 = 10^{-5}\) (left) and \(tol_2= 10^{-7}\) (right) using WENO3 with mesh size \(\Delta x=0.005\)

Fig. 5 Optimal control problem. Computed optimal control functions \({\varvec{u}}_0\) (top) and corresponding state solution \({\varvec{y}}^{n_T}\) (above the middle) with a zoom into the shock region (below the middle, bottom) for 50 iterations of the gradient based optimization algorithm, using LF, EO, and WENO3 scheme with mesh size \(\Delta x=0.005\) (left) and \(\Delta x=0.002\) (right)

Table 3 Optimal control problem

5 Summary

We have developed a novel adjoint WENO3 scheme to provide approximations of the gradient for optimal control problems governed by hyperbolic conservation laws and proved third-order consistency in space for sufficiently smooth solutions. The adjoint WENO3 method is able to sharply resolve discontinuities of reversible solutions. For an exemplary optimal control problem with discontinuous target, the method works very well and outperforms common first-order methods such as the Lax–Friedrichs and Engquist–Osher schemes.