1 Introduction

Liouville equations arise in many areas of science, such as biology, finance, mechanics, and physics. These equations govern the evolution of density functions that may represent the probability density of multiple trials of a single evolving system or the physical density of many non-interacting systems.

Although the Liouville equation in space coordinates and its phase-space counterpart are central in classical continuum mechanics, control problems governed by these equations have not been a research focus; however, we refer to [7, 8] for a discussion of the advantages of the Liouville framework. In fact, the Liouville model makes it possible to lift optimal control problems with ordinary differential equations to the realm of control problems governed by partial differential equations (PDEs), thus extending the focus from a single trajectory to an ensemble of trajectories, which also allows one to model systems with uncertainty in the initial data. With this change of perspective, the investigation of robust control strategies and feedback mechanisms appears more natural and may lead to new successful results [27]. We remark that the analysis of the Liouville equation is an important topic in the modern theory of PDEs; see, e.g., [1, 2, 10, 14].

On the other hand, much less is known about the theory of Liouville optimal control problems. Earlier contributions, concerning necessary optimality conditions for smooth cost functionals and smooth initial densities \(\rho _0\), can be found in [22, 25]. Further results are presented in [7, 8] and in [20]. Notice that the Liouville equation is related to the transport equation [2]; for control problems governed by the latter equation with smooth objectives including control costs, we refer to, e.g., [6, 11].

Very recently, Pogodaev [24] proposed a challenging Liouville optimal control problem where the controller aims to transport an initial probability measure so as to maximize the measure of a target set at a given final time. This setting accommodates multi-agent control problems as well as the control of a beam of charged particles, and it can be considered an approximation to the classical mass transportation problem [3, 31].

The purpose of our work is to develop and investigate a numerical framework for solving the Liouville control problem proposed in [24]. In doing so, we address two fundamental issues in the numerical analysis of PDEs and in optimization. On the one hand, we analyse a high-order, conservative, and positivity-preserving discretization scheme for continuity-type equations. On the other hand, we develop a new methodology for the fast solution of PDE control problems formulated in the framework of Pontryagin's maximum principle (PMP).

We remark that the development of guaranteed high-order positivity-preserving schemes for general convective PDEs (continuity equations, transport equations, convection-diffusion equations, Fokker–Planck equations, Boltzmann equations, etc.) is an open problem in numerical analysis, with only a few results valid for special cases; see [32, 33] and the discussion therein. Furthermore, see [12] concerning numerical methods for high-dimensional simulation problems governed by density-function equations and applications. Positivity of the numerical scheme is essential when solving equations that model the evolution of densities, which cannot be negative. It is equally essential when solving control problems with these models, where the action of the control function may drive the density negative if the objective rewards this development and the discretization scheme allows it. Moreover, high-order accuracy is always desirable, and conservation of total mass (or probability) is essential in the simulation and control of the evolution of densities.

In this paper, we focus on a third-order accurate positive scheme by Sanders [28], which we apply to solve the Liouville equation. We point out that Sanders' scheme is total variation diminishing (TVD) and preserves high-order accuracy even at smooth extrema of the solution, therefore representing an exception to the theory of first-order degeneracy at smooth extrema given in [21]. We prove that this scheme is conservative and overall second-order convergent in the \(L^1\) norm. Notice that this latter strong result is new in the scientific literature, even among finite-volume (FV) schemes for linear conservation laws.

The second main focus of our work is an efficient implementation of Pontryagin's maximum principle [15, 17] in the framework of PDE control problems. We remark that, compared to the large literature on the PMP for control problems governed by ordinary differential equations (ODEs), much less is known on this topic in the PDE framework; see, e.g., [5, 9, 18, 23, 26, 29, 30]. It is well known that the numerical implementation of the PMP is already challenging for ODE systems. In that case, one distinguishes direct and indirect methods: the former transform the ODE control problem into a discrete nonlinear programming problem, while the latter interpret the control problem as a boundary-value problem that is solved by shooting schemes. However, in the realm of PDE control problems these methodologies may pose severe limitations on the numerical size (and accuracy) of the problem and may lack robustness.

Our purpose is to develop a novel iterative scheme that implements the PMP characterization of optimality and does not require any differentiation of the Hamiltonian with respect to the control function. Specifically, our approach implements a control update that is consistent with the PMP and the numerical approximation of the Liouville equation and of its adjoint, thus exploiting the functional structure of the governing equation. We remark that our optimization strategy could be applied to different PDE models.

In the next section, we introduce a controlled ODE model and the corresponding Liouville PDE problem. Notice that the control function of the ODE model appears in the controlled Liouville equation as a non-linear control in the coefficients. Based on [24], we consider the Liouville equation for the time-evolution of densities and define a control-constrained optimal control problem in this setting. The purpose of the control is to maximize the distribution of density on a target set at the final time. In Sect. 3, we discuss our Liouville control problem and its PMP characterization as in [24]. Further, we introduce the Lagrange formalism for our control problem to provide a 'Lagrange' interpretation of the given PMP characterization and to derive a reduced gradient that is used in a gradient-based scheme for comparison with our PMP iterative scheme. In Sect. 4, we investigate the third-order accurate positive TVD scheme proposed by Sanders [28] and prove that this scheme is conservative and second-order accurate in the \(L^1\)-norm. In Sect. 5, we discuss our numerical optimization scheme for solving the Liouville control problem in the PMP framework. This is an iterative procedure that involves solving, pointwise in time, the PMP optimality condition formulated in such a way as to include the local discrete control-to-state map. In Sect. 6, we present results of numerical experiments to validate the accuracy of the discretization scheme and the effectiveness of our PMP optimization procedure. A section of conclusions completes this work.

2 A Liouville Optimal Control Problem

Consider a differential control model governed by the following ordinary differential equation with a given initial condition

$$\begin{aligned} \dot{x}=b(x,t,u), \qquad x(0)=x_0. \end{aligned}$$
(1)

This model and a functional objective are the main components of many optimal control problems where it is required to find a control law for (1) such that a given optimality criterion is achieved.

As pointed out in [7], a robust control formulation related to (1) is to consider the corresponding Liouville equation, given (in a one-dimensional setting) by

$$\begin{aligned} \frac{\partial }{\partial t} \rho (x,t) + \frac{\partial }{\partial x} \left( b(x,t,u) \, \rho (x,t)\right) =0, \qquad \rho (x,0)=\rho _0(x) . \end{aligned}$$
(2)

This equation describes the ensemble of trajectories of (1) for a density of initial conditions \(\rho _0\) corresponding to a unique control function.

The Liouville model can be interpreted as the result of multiple trials involving a single system or as the evolution of many identical non-interacting copies of a single system.

In the former interpretation, the function \(\rho =\rho (x,t)\) represents the probability density function (PDF) of finding the system at x at time t, assuming that \(\rho _0\) prescribes the initial probability density for \(x_0\). For this reason, we require \(\rho _0\ge 0\), \(\int _{\mathbb R}\rho _0(x)dx=1\), and consequently

$$\begin{aligned} \rho (x,t) \ge 0, \qquad \int _{\mathbb R}\rho (x,t)dx=1, \text{ for } t \ge 0. \end{aligned}$$
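These two structural properties can be mimicked at the discrete level. The following minimal sketch (not the paper's scheme, which is discussed in Sect. 4; grid, time step, and velocity are illustrative assumptions) uses a first-order conservative upwind finite-volume update for \(\rho_t+(b\rho)_x=0\) with constant \(b>0\) and periodic wrap-around, and illustrates that a conservative, positivity-preserving update keeps the total mass and the non-negativity of the density.

```python
import numpy as np

# Minimal illustrative sketch: first-order conservative upwind update for
# rho_t + (b rho)_x = 0 with constant b > 0; conservation and positivity
# of the discrete density are preserved step by step.
h, dt, b = 0.01, 0.004, 1.0              # CFL number b*dt/h = 0.4 <= 1
x = np.arange(-2.0, 2.0, h) + h / 2      # cell centers
rho = np.exp(-20.0 * x**2)
rho /= rho.sum() * h                     # normalize: unit total mass
mass0 = rho.sum() * h

for _ in range(100):
    flux = b * rho                                   # upwind flux at right faces
    rho = rho - (dt / h) * (flux - np.roll(flux, 1)) # periodic for simplicity

mass = rho.sum() * h                     # equals mass0 up to round-off
```

The flux differences telescope, so the total mass is conserved exactly, and under the CFL restriction the update is a convex combination of non-negative values.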

It is clear that solving an open-loop optimal control problem governed by (1), with u a function of time only, gives, by construction, a control that cannot capture any uncertainty in the initial condition or any disturbances along the evolution. On the other hand, if a control is sought using (2) as the governing equation, a robust control is obtained for the ensemble of all trajectories having \(\rho _0\) as the initial distribution. Further, notice that the framework given by (2) suggests considering a larger class of controls depending on time and on the coordinate x, i.e. \(u=u(x,t)\). Since x represents the state of the original ODE system, this type of control is predestined to become a feedback control function.

In the case that (2) models the evolution of many identical non-interacting 'particles', the Liouville model represents the starting point for the differential formulation of optimal transportation problems [31]. In fact, in [3] optimal transportation is formulated as a minimization problem governed by the Liouville equation, with the objective given by the 'energy' of the control in (4) with \(b=u(x)\) and given initial and final density functions.

Notice that prescribing the final density is a much stronger requirement than transporting a given mass, initially distributed on a compact support, onto a bounded target set. We consider this latter problem and assume that the control is a time-dependent function. Thus a control \(u=u(t)\) is sought that applies to all trajectories in the ensemble having \(\rho _0\) as the initial distribution.

This problem can be formulated as follows

$$\begin{aligned} \begin{aligned} \max&\, J(\rho ,u):=\int _\mathcal{B}\rho (x,T)~ dx ,\\ \end{aligned} \end{aligned}$$
(3)

such that

$$\begin{aligned} \begin{aligned}&\rho _t + \nabla \cdot (b(x,t,u) \, \rho ) = 0,\\&\rho (x,0) = \rho _0 , \\ \end{aligned} \end{aligned}$$
(4)

where \(x\in {\mathbb R}^n\), \(b: {\mathbb R}^n \times {\mathbb R}\times U \rightarrow {\mathbb R}^n\), and we use the compact notation \(\rho _t = \frac{\partial \rho }{\partial t}\); \(\nabla \) is the Cartesian gradient with respect to x, and thus \(\nabla \cdot \) denotes the divergence. Further, we choose an initial density \(\rho _0\) that is non-negative, normalized to one, and with compact support in \(\mathcal{A}\subset {\mathbb R}^n\). We also assume that \(U\subset {\mathbb R}^m\), \(m \le n\), and \(\mathcal{B}\subset {\mathbb R}^n\) are compact and nonempty.
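To make the objective (3) concrete in one dimension, here is a small sketch (the grid, helper name, and data are illustrative assumptions, not the paper's code) that evaluates J by summing cell averages over the cells whose centers fall in a target interval \(\mathcal{B}\).

```python
import numpy as np

# Hypothetical helper: evaluate J(rho, u) = int_B rho(x, T) dx on a
# uniform grid of cell averages; B is taken as the interval [b_lo, b_hi].
def objective(rho_T, centers, h, b_lo, b_hi):
    inside = (centers >= b_lo) & (centers <= b_hi)
    return rho_T[inside].sum() * h

h = 0.001
x = np.arange(-2.0, 2.0, h) + h / 2
rho_T = np.where(np.abs(x) <= 0.5, 1.0, 0.0)  # uniform density on [-1/2, 1/2]
J = objective(rho_T, x, h, 0.0, 1.0)          # half of the mass lies in B = [0, 1]
```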

We remark that, in the present framework and up to normalization, the problems of determining the evolution of a probability density and the transport of mass are equivalent.

It is the purpose of this paper to investigate (3)–(4) from a numerical perspective, within the theoretical framework given in [24]. In this reference, the Liouville equation for measures, instead of that for functions, is considered, and the focus is on the following optimal control problem

$$\begin{aligned} \begin{aligned} \max&\,\mu (T)(\mathcal{B}) ,\\ \end{aligned} \end{aligned}$$
(5)

such that

$$\begin{aligned} \begin{aligned}&\mu _t + \nabla \cdot (b(x,t,u)\, \mu ) = 0,\\&\mu (0) = \mu _0 , \\ \end{aligned} \end{aligned}$$
(6)

where a distributional solution to the forward problem (6), denoted with \(\mu \in C([0,T];\mathcal{P}({\mathbb R}^n))\), is defined as the solution to the following

$$\begin{aligned} \int _0^T \int _{{\mathbb R}^n} (\phi _t + b \cdot \nabla \phi )d\mu (t)dt=0, \end{aligned}$$

for all smooth and compactly supported functions \(\phi :{\mathbb R}^n \times [0,T] \rightarrow {\mathbb R}\), with \(\mu (0)=\mu _0\); here \(\mathcal{P}({\mathbb R}^n)\) denotes the set of all probability measures on \({\mathbb R}^n\), equipped with the Lévy–Prokhorov metric as defined in [24, Sect. 2.1]. As in [24], and motivated by the aim of obtaining a robust control for the ensemble of trajectories modelled by the Liouville equation, we consider the following set of admissible controls

$$\begin{aligned} \mathcal{U}=\lbrace u=(u_1,\ldots ,u_m) \in L^\infty (0,T;{\mathbb R}^m), \, u(t) \in U \text{ for } \text{ all } t \in [0,T]\rbrace , \end{aligned}$$
(7)

where \(U \subset {\mathbb R}^m\) is a compact set.

Now, we consider a vector field b satisfying the following:

Assumption 2.1

(A1)

$$\begin{aligned} \begin{aligned}&\text{The map } \, b:{\mathbb R}^n \times [0,T]\times \mathbb {R}^m \rightarrow \mathbb {R}^n \, \text{ is continuous;} \\&\text{there are constants } \, L,C \, \text{ such that, for all } x,x' \in {\mathbb R}^n, \, t \in [0,T], \, u \in U,\\&|b(x,t,u)-b(x',t,u)|\le L \, |x-x'| \quad \text{and} \quad |b(x,t,u)|\le C \, (1+|x|). \end{aligned} \end{aligned}$$

With this assumption and \(u \in \mathcal{U}\), the ODE problem (1) admits a unique absolutely continuous solution \(x: [0,T] \rightarrow {\mathbb R}^n\); see the Carathéodory theorem in, e.g., [13]. Notice that the Lipschitz property of b with respect to x implies that \(b(\cdot ,t,u) \in W^{1,\infty }(\mathbb {R}^n)\). Therefore Assumption A1 on b is stronger than the assumption \(b \in L^1([0,T]; W^{1,\infty }(\mathbb {R}^n))\) in, e.g., [2] (where no u appears).

Next, we define the flow of the vector field b. We have

Definition 2.1

The unique solution \(s \mapsto V^s_t(x)\) to the Cauchy problem

$$\begin{aligned} \begin{aligned}&\dot{y}(s)=b(y(s),s),\\&y(t)=x , \end{aligned} \end{aligned}$$
(8)

defines the map \((s,t,x)\mapsto V^s_t(x)\), which is called the flow of the vector field b.

Based on the results in [24], we can state the following:

Theorem 2.1

Under Assumption A1 and for \(u \in \mathcal{U}\), problem (6) admits a unique distributional solution \(\mu \in C([0,T];\mathcal{P}({\mathbb R}^n))\). In particular, if \(\mu _0\) is absolutely continuous, then \(\mu (t)\) remains absolutely continuous for all t, and in this case (4) and (6) are equivalent.

For further results and references on the theory of the Liouville (continuity) equation, we refer to, e.g., [4] and [2, 14].

Next, concerning the optimal control problem (5)–(6), we recall that, under Assumption A1 and the condition that \(\mathcal{B}\) is a closed set in \({\mathbb R}^n\), it is proved in [24, Theorem 1] that (5)–(6) has a solution in the space of so-called generalized controls.

Furthermore, based on a specific structure of the velocity field, the existence of an optimal control in \(\mathcal{U}\) is established in [24]. This structure is given by the following theorem, which represents a special case of Corollary 1 in [24]. We have

Theorem 2.2

Let \(b=(b^1,\ldots ,b^n)\) have the form

$$\begin{aligned} b(x,t,u)= b_0(x,t) + \sum _{j=1}^m u_j(t)\, b_j(x,t) , \end{aligned}$$
(9)

where \(b_0=(b_0^1,\ldots ,b_0^n)\) and \(b_j=(b_j^1,\ldots ,b_j^n)\), \(j=1, \ldots , m\), satisfy Assumption A1. Further, assume that the target set \(\mathcal{B}\) is closed. Then (5)–(6) has a solution in \(\mathcal {U}\). The theorem also holds if \(u_j\) in (9) is replaced by \(\Phi _j(u_j)\), where the \(\Phi _j\) are convex functions.

Remark 2.1

Since Theorem 2.2 holds for convex \(\Phi _j\), b may be a non-differentiable function of u; for example, one could choose \(\Phi _j(u_j)=|u_j|\). This is illustrated with a numerical example in Sect. 6.

Based on this theorem and Remark 2.1, and assuming that the initial probability measure \(\mu _0\) is absolutely continuous with continuous density \(\rho _0\), the optimal control problem (3)–(4) with a velocity field given by (9) has a solution \(u\in \mathcal {U}\).
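A control-affine velocity field with the structure (9), including the convex substitution of Remark 2.1, can be sketched as follows; the drift, the basis fields, and the choice \(\Phi_j = |\cdot|\) below are illustrative assumptions, not taken from the paper's experiments.

```python
import numpy as np

# Illustrative 1D velocity field with the structure (9):
#   b(x,t,u) = b0(x,t) + sum_j Phi_j(u_j) * b_j(x,t),
# here with Phi_j(u_j) = |u_j| as in Remark 2.1.
def make_b(b0, basis, phis):
    def b(x, t, u):
        out = b0(x, t)
        for uj, bj, phi in zip(u, basis, phis):
            out = out + phi(uj) * bj(x, t)
        return out
    return b

b = make_b(lambda x, t: np.zeros_like(x),   # zero drift (assumption)
           [lambda x, t: np.ones_like(x),   # b_1: translation field
            lambda x, t: x],                # b_2: dilation field
           [abs, abs])                      # Phi_j = |.|

x = np.linspace(-1.0, 1.0, 5)
v = b(x, 0.0, (-0.5, 1.0))                  # = 0.5 + x, since |-0.5| = 0.5
```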

3 Optimality Conditions for the Liouville Control Problem

In this section, we discuss the characterization of optimal controls by the necessary optimality conditions given by Pontryagin's maximum principle (PMP) and by the Lagrangian framework, and we discuss the relationship between these conditions, which helps us build an efficient numerical strategy for determining an optimal control for our optimization problem (3)–(4).

We make the following additional assumptions on the vector field b.

Assumption 3.1

(A2) The map \( b:{\mathbb R}^n \times [0,T]\times \mathbb {R}^m \rightarrow \mathbb {R}^n\), with structure (9), is continuous and \(b(\cdot , t, u) \in H^m(\mathbb {R}^n)\) with \(m > (n+4)/2\).

Notice that, by Sobolev imbedding, Assumption A2 implies that b is twice continuously differentiable in the spatial variable x, as required in [24, Assumption A2].

Next, we state the existence and regularity of solutions to (4) in a standard functional setting.

Theorem 3.1

Under Assumption A2, for \(u\in \mathcal {U}\) and a given initial density \(\rho _0\in H^1({\mathbb R}^n)\), there exists a unique solution \(\rho \in C^1(0,T;H^1({\mathbb R}^n)) \cap C(0,T;H^{2}({\mathbb R}^n))\) to (4).

Proof

The proof of the theorem follows from [4, Prop. 2.1].

To characterize the optimal control in terms of necessary optimality conditions, we need the following definition.

Definition 3.1

A compact set \(\mathcal {B} \subset \mathbb {R}^n\) is said to have the interior ball property of radius r, if it can be written as a union of closed balls of radius r.

Now, we state the following necessary optimality condition in the sense of the PMP [15]; see [24, Theorem 2]. We have

Theorem 3.2

Let \(\mathcal{B}\) be a compact set with the interior ball property, \(\rho _0 \in C^1({\mathbb R}^n)\), and let b satisfy all conditions of Assumption A2. Let \(u^*\) be an optimal control for (3)–(4) and \(\rho ^*\) be the corresponding density function. Then, for almost every \(t \in [0,T]\), the following holds

$$\begin{aligned} \int _{\partial \mathcal{B}^t} \, \rho ^*(x,t) \, b(x,t,u^*(t))\cdot \eta _{\mathcal{B}^t}(x)~d\sigma (x) = \min _{w\in U}\int _{\partial \mathcal{B}^t}\, {\rho ^*(x,t)} \, b(x,t,w)\cdot \eta _{\mathcal{B}^t}(x)~d\sigma (x), \end{aligned}$$
(10)

where \(\mathcal{B}^t = \bar{V}_T^t(\mathcal{B})\) with \(\bar{V}\) being the flow of the vector field \((x,t) \mapsto b(x,t,u^*(t))\), \(\eta _{\mathcal{B}^t}(x)\) is the measure-theoretic outer unit normal to \(\mathcal{B}^t\) at x, and \(\sigma \) is the \((n-1)\)-dimensional Hausdorff measure.

Notice that if \(\partial \mathcal{B}\) is a \(C^2\) surface, then \(\mathcal{B}\) satisfies the interior ball condition; this is also true for domains with \(C^{1,1}\) boundary. In these cases, \(\eta _{\mathcal{B}}(x)\) is the usual outer unit normal to \(\mathcal{B}\).

We notice that optimal control problems with PDEs are usually investigated in the Lagrange framework [30], which provides an easier derivation and interpretation of optimality conditions by relying on differential calculus. On the other hand, our purpose is to develop a numerical framework that relies on the PMP without exploiting any differentiability with respect to the control variable. In order to illustrate the relationship between the two optimization frameworks, and for later comparison, in the remainder of this section we discuss the Lagrange formalism. For this purpose, under the assumptions of Theorem 3.2, we define the following Lagrange function corresponding to the optimization problem (3)–(4). We have

$$\begin{aligned} L(\rho , q, u) = \int _{\mathbb {R}^n} \chi _\mathcal {B}(x)~\rho (x,T) dx - \int _0^T\int _{\mathbb {R}^n}\lbrace \rho _t+\nabla \cdot (b(x,t,u)\rho )\rbrace \, q(x,t)~ dxdt,\quad \end{aligned}$$
(11)

where we introduce the Lagrange multiplier \(q\in L^2(0,T;L^2({\mathbb R}^n))\) and \(u\in \mathcal {U}\). Notice that, with this setting, the second integral in (11) is well-defined.

We denote the dependence of the unique solution of (4) on a given control function \(u\in \mathcal {U}\) by \(\rho = \Lambda (u)\); one can prove that this mapping is differentiable. We introduce the reduced cost functional \(\hat{J}\) given by

$$\begin{aligned} \hat{J}(u)=J(\Lambda (u),u). \end{aligned}$$
(12)

We now prove the following theorem regarding the gradient of the reduced functional \(\hat{J}(u)\). Notice that, for the purpose of the Lagrange framework, we require differentiability of the Liouville model with respect to u.

Theorem 3.3

The first-order optimality condition for (3)–(4) is given by

$$\begin{aligned} - \int _0^T\int _{\mathbb {R}^n} \Bigg \lbrace \nabla \cdot \Bigg (\frac{\partial b}{\partial u} (x,t,u^*(t)) \, \rho ^*(x,t) \Bigg )\Bigg \rbrace \, q^*(x,t) \cdot (u(t)-u^*(t))\, dxdt \le 0,\qquad \forall u\in \mathcal{U}, \end{aligned}$$
(13)

where \(u^*\) denotes the optimal control, \(\rho ^*=\Lambda (u^*)\) represents the solution to (4) with \(u=u^*\) and a given continuous initial density \(\rho _0\). Further, \(q^*=q(u^*)\) denotes the solution to the following adjoint equation

$$\begin{aligned} \begin{aligned}&q_t + b(x,t,{u^*})\cdot \nabla q=0,\\&q(x,T)=\chi _\mathcal {B}(x), \end{aligned} \end{aligned}$$
(14)

where \(\chi _\mathcal{B}\) denotes the characteristic function of the set \(\mathcal{B}\) and by [4, Prop. 2.1], we have \(q\in C^1(0,T;H^1({\mathbb R}^n)) \cap C(0,T;H^{2}({\mathbb R}^n))\).

Proof

We consider the cost functional \((\rho ,u)\mapsto J(\rho ,u)\) given in (3). Let \(\delta \rho \) satisfy the linearized constraint

$$\begin{aligned} \delta \dot{\rho }+\nabla \cdot \Bigg (\dfrac{\partial b}{\partial u}(x,t,u)\rho \Bigg ) \delta u + \nabla \cdot (b(x,t,u) \delta \rho ) = 0,~t\in (0,T],~ \delta \rho (0)=0. \end{aligned}$$
(15)

Then, by Theorem 3.1, this equation admits a unique solution \(\delta \rho \in C^1(0,T;H^1({\mathbb R}^n)) \cap C(0,T;H^{2}({\mathbb R}^n))\). Let us denote by \((\cdot , \cdot )_{L^2} \) the standard \(L^2(Q)\) inner product. We have

$$\begin{aligned} \begin{aligned} (\nabla _{u_i}\hat{J}(u), \delta u_i)_{L^2}&= \Bigg (\dfrac{\partial J}{\partial \rho }, \delta \rho \Bigg )_{L^2}+\Bigg (\dfrac{\partial J}{\partial u_i}, \delta u_i\Bigg )_{L^2}\\&=(\chi _\mathcal {B}, \delta \rho (T))_{L^2}\\&=(q(T), \delta \rho (T))_{L^2},\\ \end{aligned} \end{aligned}$$

where q satisfies (14). Using integration by parts, we have the following

$$\begin{aligned} \begin{aligned} (q(T), \delta \rho (T))_{L^2}&= -(q(0), \delta \rho (0))_{L^2} + (\dot{q}, \delta \rho )_{L^2} + (q, \delta \dot{\rho })_{L^2}\\&=(-b\cdot \nabla q, \delta \rho )_{L^2}-(q, \nabla \cdot (b~ \delta \rho ))_{L^2} -\Bigg ( q,\nabla \cdot \Bigg (\dfrac{\partial b}{\partial u_i}~\rho \Bigg ) \delta u_i\Bigg )_{L^2}\\&= -\Bigg ( q,\nabla \cdot \Bigg (\dfrac{\partial b}{\partial u_i}~\rho \Bigg ) \delta u_i\Bigg )_{L^2}. \end{aligned} \end{aligned}$$

Thus, the reduced gradient \(\nabla _{u}\hat{J}(u)\) is given by

$$\begin{aligned} \nabla _{u}\hat{J}(u) = -\int _{\mathbb {R}^n}\nabla \cdot \Bigg (\dfrac{\partial b}{\partial u}~\rho \Bigg )\, q~dx. \end{aligned}$$

The first-order necessary optimality condition can then be stated as follows

$$\begin{aligned} \Bigg ( \nabla _u \hat{J}(u^*), u-u^* \Bigg )_{L^2(0,T;{\mathbb R}^m)} \le 0,\qquad \forall u\in \mathcal {U}; \end{aligned}$$

see, e.g., [30], and thus we have (13). \(\square \)

We remark that the optimality condition (13), considered pointwise in t, corresponds to the differential characterization of the PMP minimization problem (10), where the latter does not involve differentiation with respect to u. To prove this fact, we use the following identity

$$\begin{aligned} \int _{\mathbb {R}^n} ( \nabla \cdot v(x) ) \, \chi _\mathcal{B}(x) \, dx = \int _{\partial \mathcal{B}} ( v(x) \cdot \eta _{\mathcal{B}}(x) )~d\sigma (x) . \end{aligned}$$
(16)

This calculus rule can be shown using the notion of the surface delta function, assuming a sufficiently regular \({\partial \mathcal{B}}\), e.g., \(\partial \mathcal {B}\in C^2\). We use this fact in (10) with \(\mathcal {B}=\mathcal {B}^t\) and notice that \(\chi _{\mathcal{B}^t}\) coincides with the optimal adjoint function \(q^*(x,t)\). Therefore, we obtain the PMP condition (10) in the following form

$$\begin{aligned} \int _{\mathbb {R}^n} \, \nabla \cdot ( b(x,t,u^*(t))\, \rho ^*(x,t)) \, q^*(x,t)~dx = \min _{w\in U}\int _{\mathbb {R}^n} \, \nabla \cdot ( b(x,t,w)\, {\rho ^*(x,t)} ) \, q^*(x,t)~dx . \end{aligned}$$
(17)

Now, we focus on the structure (9). In this case, (17) becomes

$$\begin{aligned} u^*(t)= \mathop {\hbox {arg min}}\limits _{w\in U} \, \sum _{j=1}^m w_j \int _{\mathbb {R}^n} \, \nabla \cdot ( b_j(x,t)\, {\rho ^*(x,t)} ) \, q^*(x,t)~dx . \end{aligned}$$
(18)
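Because the minimand in (18) is linear in w, when U is a box the minimum is attained componentwise at an endpoint determined by the sign of the corresponding coefficient. A hedged sketch of this selection step (the coefficient values and bounds are illustrative; the helper name is an assumption):

```python
import numpy as np

# For the structure (9) with U = [u_min, u_max]^m, the pointwise PMP
# minimization (18) of w -> sum_j w_j c_j, with coefficients
# c_j = int div(b_j rho*) q* dx, reduces to a componentwise sign test.
def pmp_update(c, u_min, u_max):
    # where c_j > 0 take the lower bound, where c_j < 0 the upper bound;
    # for c_j = 0 any admissible value is optimal (upper bound returned here)
    return np.where(c > 0.0, u_min, u_max)

c = np.array([0.7, -0.2, 0.0])       # illustrative coefficients
u_star = pmp_update(c, -1.0, 1.0)    # -> [-1., 1., 1.]
```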

Furthermore, the Lagrange approach leads to considering the \(L^2(0,T)\) gradient given by

$$\begin{aligned} \nabla _{u_j} \hat{J}(u) (t) = - \int _{\mathbb {R}^n} \, \nabla \cdot (b_j (x,t) \, \rho (x,t) ) \, q(x,t)\, dx , \qquad j=1,\ldots , m. \end{aligned}$$
(19)
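Given grid values of \(\rho\) and q at one time level, the gradient component (19) can be approximated by a central difference and a midpoint-rule sum. The sketch below uses illustrative fields; for \(b_j\equiv 1\), integration by parts gives \(-\int \rho' q\,dx=\int \rho\, q'\,dx\), which serves as a sanity check (for the chosen Gaussians this value is \(e^{-1/2}/\sqrt{2}\)).

```python
import numpy as np

# Sketch: approximate (19) at a fixed t by central differencing of
# d/dx (b_j rho) and midpoint quadrature (grid and fields are
# illustrative; compactly supported data makes the wrap-around harmless).
def reduced_gradient_j(bj_vals, rho, q, h):
    flux = bj_vals * rho
    dflux = (np.roll(flux, -1) - np.roll(flux, 1)) / (2.0 * h)  # central d/dx
    return -(dflux * q).sum() * h

h = 1e-3
x = np.arange(-3.0, 3.0, h) + h / 2
rho = np.exp(-x**2) / np.sqrt(np.pi)     # smooth density (assumption)
q = np.exp(-(x - 1.0)**2)                # smooth stand-in for the adjoint
grad = reduced_gradient_j(np.ones_like(x), rho, q, h)
# for b_j = 1 this approximates int rho(x) q'(x) dx = exp(-1/2)/sqrt(2)
```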

It follows from Theorem 3.3 that, in the Lagrange framework, we can characterize the optimal solution to (17) as the solution to the following optimality system

$$\begin{aligned}&\begin{aligned}&\rho _t + \nabla \cdot (b(x,t,u)\rho ) = 0,\qquad (x,t) \in Q, \qquad \text{(forward } \text{ equation) }\\&\rho (x,0) = \rho _0(x),\qquad x\in \Omega , \\ \end{aligned} \end{aligned}$$
(20)
$$\begin{aligned}&\begin{aligned}&q_t + b(x,t,u)\cdot \nabla q = 0,\qquad (x,t) \in Q,\qquad \text{(adjoint } \text{ equation) }\\&q(x,T) = \chi _{\mathcal {B}}(x) ,\qquad x\in \Omega , \\ \end{aligned} \end{aligned}$$
(21)
$$\begin{aligned}&\begin{aligned} \Bigg ( \nabla _u \hat{J}(u), v-u \Bigg )_{L^2(0,T;{\mathbb R}^m)} \le 0,\qquad \forall v\in \mathcal {U}.\qquad \text{(optimality } \text{ condition) } \end{aligned} \end{aligned}$$
(22)

The requirements that \(\rho _0\) be continuously differentiable and that \(\mathcal{B}\) have the interior ball property are two limitations of the present optimal control formulation. However, these limitations can be bypassed in the following way [24].

The initial density \(\rho _0\) is replaced by a mollified approximation, using a mollifier \(\varphi =\varphi (x)\) with \(\varphi _\epsilon (x)=\epsilon ^{-n}\, \varphi (x/\epsilon )\), as follows

$$\begin{aligned} \rho _0^\epsilon (x)=(\rho _0 * \varphi _\epsilon )(x) =\int _{{\mathbb R}^n} \varphi _\epsilon (x - y) \rho _0(y)~dy . \end{aligned}$$
(23)

We also have \(\int _{{\mathbb R}^n} \rho _0^\epsilon (x) dx=1\).
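On a grid, the mollification (23) is a discrete convolution. The sketch below uses the standard bump-function mollifier and illustrative grid parameters (all choices are assumptions), and checks that unit total mass and non-negativity are preserved.

```python
import numpy as np

# Sketch of (23): convolve a discontinuous rho_0 with the scaled
# standard mollifier phi_eps (grid spacing and eps are illustrative).
h, eps = 0.002, 0.05
x = np.arange(-2.0, 2.0, h) + h / 2
rho0 = np.where(np.abs(x) <= 0.5, 1.0, 0.0)         # unit mass, not C^1

z = np.arange(-1.0, 1.0, h) + h / 2                  # mollifier grid
s = np.clip(1.0 - (z / eps)**2, 1e-12, None)         # guard the denominator
phi = np.where(np.abs(z) < eps, np.exp(-1.0 / s), 0.0)
phi /= phi.sum() * h                                 # normalize: int phi_eps = 1

rho0_eps = np.convolve(rho0, phi, mode="same") * h   # (rho_0 * phi_eps)(x)
mass = rho0_eps.sum() * h                            # still 1 up to round-off
```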

For the target set \(\mathcal{B}\) to satisfy the interior ball property, we replace it with its closed \(r\epsilon \)-neighbourhood given by

$$\begin{aligned} \mathcal{B}_{r\epsilon } = \{ x \in {\mathbb R}^n \, | \, |x - y| \le r \epsilon \, \text{ for } \text{ some } y \in \mathcal{B}\} . \end{aligned}$$

With these modifications and using results in [24], we can state that any accumulation point of the family of controls of the perturbed problems is optimal for the original non-perturbed problem, and the optimal value of the non-perturbed objective is obtained as the limit, for \(\epsilon \rightarrow 0\), of that of the perturbed problems. In Sect. 6, these properties are used in the numerical experiments.

With this remark, we have completed the theoretical discussion on the formulation of the Liouville control problem. In the following sections, we focus on the development of a numerical framework for solving this problem. We consider a one-dimensional setting and provide comments concerning extensions to multi-dimensions.

4 Sanders’ Discretization Scheme

Essential for the numerical solution of our Liouville control problem is an accurate discretization scheme for the forward Liouville problem (20) and for its adjoint equation (21). We thus consider high-order accurate finite-volume (FV) schemes for the Liouville equation.

There is a well-known theory of first-order degeneracy at smooth extrema for high-order accurate TVD finite-volume schemes, given in [21]; these schemes are constructed using the TVD property of the solution. To improve the order of accuracy near extrema, Sanders introduced a third-order accurate TVD finite-volume scheme [28]. This is the first FV scheme in the literature that preserves high-order accuracy even at smooth extrema of the solution. For this purpose, the construction of Sanders' scheme uses the TVD property of the reconstructing polynomial rather than that of the solution. Later on, Zhang and Shu [32] extended this methodology to construct higher-order (up to sixth-order) accurate TVD finite-volume schemes for one-dimensional scalar conservation laws.

Sanders shows that his scheme is positivity preserving, in the sense that starting from a non-negative density a non-negative solution is obtained. We show that Sanders' scheme can be written in conservative form. Moreover, we prove that Sanders' scheme is overall second-order convergent in the \(L^1\) norm. Notice that the focus on the \(L^1\) norm is natural in the context of density evolution equations, since positivity and conservation of total probability imply that the \(L^1\) norm of the solution is conserved.

Now, we discuss the scheme of Sanders for solving the Liouville problem. For this purpose, we need the following preparation.

We partition the real line into equally spaced non-overlapping intervals, \(\mathbb {R}=\bigcup _j I_j\), where \(I_j = [x_{j-1/2},x_{j+1/2})\), as shown in Fig. 1. Let \(\bar{I}_j = [x_{j-1/2},x_{j+1/2}]\) and \(I_j^0 = (x_{j-1/2},x_{j+1/2})\), and let h denote the size of each \(I_j\). Notice that \(I_j\) is centered at \(x_{j}\).

Fig. 1 The space grid and the interval \(I_j\)

We denote the cell average of \(\rho (x)\) over \(I_j\) by \(\bar{\rho }_j = \frac{1}{h}\int _{I_j} \rho (x)dx\), and the boundary values of \(\rho \) at \(x_{j \pm 1/2}\) by \(\rho _{j \pm 1/2} = \rho (x_{j \pm 1/2})\). Further, we discretize the time interval [0, T] into sub-intervals of size \(\Delta t = T/N\), \(N \in {\mathbb N}^+\), and denote the time steps by \(t^k = k \Delta t\), \(k=0,\ldots , N\). In our discretization setting, any dependent variable, e.g. \(\phi \), evaluated at the time step \(t^k\) is denoted by \(\phi ^k\). We also consider a piecewise constant approximation to the control function, where we denote by \(u^{k+1/2}\) the value of u in the time interval \((t^k,t^{k+1})\).

Next, we discuss the piecewise quadratic polynomial reconstruction method at the base of Sanders' scheme. The purpose of this procedure is to construct a piecewise quadratic polynomial approximation to \(\rho ^k(x)\) using the cell averages and the cell boundary values of \(\rho ^k(x)\) on \(I_j\). Notice that here the terminology 'polynomial' is used to name the following function

$$\begin{aligned} P^k(x) = \sum _j P_j^k(x) , \qquad \text{ where } \qquad P_j^k(x) = \left\{ \begin{array}{ll} R_j^k(x-x_j), &{} x \in I^0_j, \\ \rho ^k_{j-1/2}, &{} x = x_{j-1/2}, \end{array} \right. \end{aligned}$$
(24)

where

$$\begin{aligned}&R_j^k(x-x_j) = \hat{R}_j^k(\tau _L \widehat{\rho ^k_L},\tau _R \widehat{\rho ^k_R})(\theta )+\bar{\rho }^k_j,\\&\hat{R}_j^k(a,b)(\theta ) = 3(a+b)\theta ^2 + (b-a)\theta -(a+b)/4,\\&\widehat{\rho ^k_L} = \rho ^k_{j-1/2} - \bar{\rho }^k_j,~ \widehat{\rho ^k_R} = \rho ^k_{j+1/2} - \bar{\rho }^k_j,~ \theta = \dfrac{x-x_j}{h}, \end{aligned}$$

and the limiter coefficients \(\tau _L, \tau _R\) are given by the following recipe:

Let

$$\begin{aligned} M=\text {maxmod}(\widehat{\rho ^k_L},\widehat{\rho ^k_R}), \quad m=\text {minmod}(\widehat{\rho ^k_L},\, \widehat{\rho ^k_R}), \quad r = m/M,~\hat{E}=E/M \text{ for } M\ne 0, \end{aligned}$$

where

$$\begin{aligned} E= {\left\{ \begin{array}{ll} {\sup }_{\bar{I}_j}(\rho ^k-\bar{\rho }^k_j),\qquad \text{ if } M <0,\\ {\inf }_{\bar{I}_j}(\rho ^k-\bar{\rho }^k_j),\qquad \text{ if } M \ge 0,\\ \end{array}\right. } \end{aligned}$$
(25)

and

$$\begin{aligned} \tau _- = -\dfrac{1}{2}\left[ {(r+3\hat{E})-(3(\hat{E}-r)(3\hat{E}+r))^{1/2}}\right] , \qquad \tau _+ = -\hat{E}\cdot \dfrac{3(1+r)}{1+r+r^2}. \end{aligned}$$

If \(-1 \le r\le -1/2\), then

$$\begin{aligned} \tau _L = 1,\qquad \tau _R = 1. \end{aligned}$$

If \(-1/2< r < 0\) and \(\widehat{\rho ^k_R}=M\), then

$$\begin{aligned} \tau _L = 1,\qquad \tau _R = \min (\tau _-,1). \end{aligned}$$

If \(-1/2< r < 0\) and \(\widehat{\rho ^k_L}=M\), then

$$\begin{aligned} \tau _L = \min (\tau _-,1),\qquad \tau _R = 1. \end{aligned}$$

If \(0 \le r\le 1\), then

$$\begin{aligned} \tau _L = \min (\tau _+,1),\qquad \tau _R = \min (\tau _+,1). \end{aligned}$$
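The case analysis above translates directly into code. The following sketch assumes that the extremum bound E of (25) has already been computed elsewhere, and takes maxmod (minmod) to return the argument of larger (smaller) modulus; the function name and argument order are our own.

```python
import math

def limiter_taus(rL, rR, E):
    """Sketch of the tau_L, tau_R recipe: rL, rR play the roles of
    rho_hat_L, rho_hat_R, and E is the extremum bound from (25)."""
    M = rL if abs(rL) >= abs(rR) else rR      # maxmod: argument of larger modulus
    m = rR if abs(rL) >= abs(rR) else rL      # minmod: argument of smaller modulus
    if M == 0:
        return 1.0, 1.0
    r, Eh = m / M, E / M
    if -1 <= r <= -0.5:
        return 1.0, 1.0
    if 0 <= r <= 1:
        tau_plus = -Eh * 3 * (1 + r) / (1 + r + r * r)
        return min(tau_plus, 1.0), min(tau_plus, 1.0)
    # remaining case: -1/2 < r < 0
    tau_minus = -0.5 * ((r + 3 * Eh) - math.sqrt(3 * (Eh - r) * (3 * Eh + r)))
    if rR == M:
        return 1.0, min(tau_minus, 1.0)
    return min(tau_minus, 1.0), 1.0
```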

Using the above reconstruction technique, the polynomial \(P^k(x)\) has the following properties

$$\begin{aligned} \begin{aligned}&\dfrac{1}{h}\int _{I_j} P^k(x)dx = \bar{\rho }^k_j,\\&\quad TV(P^k(x)) \le TV(\rho ^{k}(x)),\\&\quad \inf (\rho ^k(x))\le \inf (P^k(x))\le \sup (P^k(x)) \le \sup (\rho ^k(x)),\\&\quad P^k(x) = \rho ^k(x) + \mathcal {O}(h^3). \end{aligned} \end{aligned}$$
(26)

The next step in Sanders' scheme is to model the evolution of the polynomial approximation over a half time step, \(t ^k \rightarrow t^{k + 1/2}\). For this purpose, consider the following backward characteristic equation for the x variable

$$\begin{aligned} \dfrac{x_j-x}{\Delta t/2} = b(x,t^k,u^{k+1/2}), \qquad x\in I_j . \end{aligned}$$
(27)

We denote its solution by \(y_j\).
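Since b may depend on x, (27) is in general a nonlinear equation for \(y_j\). A simple fixed-point sketch, which converges when \((\Delta t/2)\,|b'| < 1\) (a condition implied by a CFL-type restriction), reads:

```python
def backward_foot(x_j, dt, b, tol=1e-12, max_iter=100):
    """Fixed-point sketch for (27): find y with (x_j - y)/(dt/2) = b(y).
    The iteration y <- x_j - (dt/2)*b(y) is a contraction when (dt/2)*|b'| < 1."""
    y = x_j
    for _ in range(max_iter):
        y_new = x_j - 0.5 * dt * b(y)
        if abs(y_new - y) < tol:
            return y_new
        y = y_new
    return y

# for a constant drift the foot of the characteristic is exactly x_j - (dt/2)*b
y_const = backward_foot(0.0, 0.1, lambda x: 1.0)
# for b(x) = x the fixed point solves y = x_j - (dt/2)*y
y_lin = backward_foot(1.0, 0.1, lambda x: x)
```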

Fig. 2: Staggered grid \(I_{j-1/2}\) at half time step

Fig. 3: Trapezoidal region in the \(x-t\) plane

We now consider a staggered cell \(I_{j-1/2}=[x_{j-1},x_j)\) as shown in Fig. 2. The staggered cell endpoint values are evaluated as \(P^{k+1/2}(x_j) = P^k(y_j)\).

Using the divergence theorem on the trapezoid with vertices \((x_{j-1},t^k+\Delta {t}/2), (x_{j},t^k+\Delta {t}/2), (y_{j-1},t^k), (y_{j},t^k),\) the cell average of \(P^{k+1/2}(x)\) on \(I_{j-1/2}\) is evaluated by the formula

$$\begin{aligned} \begin{aligned} \bar{P}^{k+1/2}_{j-1/2} =&\dfrac{1}{h}\Bigl [\int _{y_{j-1}}^{y_j}P^k(x)dx- \dfrac{\Delta t}{2}~b(y_j,t^k,u^{k+1/2})P^k(y_j)+(x_j-y_j)P^k(y_j)\\&+\dfrac{\Delta t}{2}~b(y_{j-1},t^k,u^{k+1/2})P^k(y_{j-1})-(x_{j-1}-y_{j-1})P^k(y_{j-1})\Bigr ] . \end{aligned} \end{aligned}$$
(28)

Notice that for computing E in (25), the values \(\sup _{I_{j-1/2}}P^{k+1/2}(\cdot ),~\inf _{I_{j-1/2}}P^{k+1/2}(\cdot )\) are required. But, in general, they are difficult to calculate. For this reason, we replace these values with the following ones

$$\begin{aligned} \sup _{[y_{j-1},y_j]} P^k(\cdot ),\qquad \inf _{[y_{j-1},y_j]} P^k(\cdot ). \end{aligned}$$

At this point, the illustration of Sanders' procedure for determining the evolution to \(t^k+\Delta t/2\) and the reconstruction on the staggered mesh \(I_{j-1/2}\) is complete.

In an analogous way, we implement the evolution from \(t^k+\Delta t/2\) to \(t^{k}+\Delta t\) to get the solution \(\rho ^{k+1}\) on the mesh \(I_j\). We then compute the cell average of \(\rho ^{k+1}\) as follows

$$\begin{aligned} \begin{aligned} \bar{\rho }_j^{k+1}&=\bar{Q}^{k+1}_{j} = \dfrac{1}{h}\left[ \int _{z_{j-1/2}}^{z_{j+1/2}}Q^{k+1/2}(x)dx- \dfrac{\Delta t}{2}~b(z_{j+1/2},t^{k+1/2},u^{k+1/2})Q^{k+1/2}(z_{j+1/2})\right. \\&\quad +(x_{j+1/2}-z_{j+1/2})Q^{k+1/2}(z_{j+1/2}) +\dfrac{\Delta t}{2}~b(z_{j-1/2},t^{k+1/2},u^{k+1/2})Q^{k+1/2}(z_{j-1/2})\\&\left. \quad -\,(x_{j-1/2}-z_{j-1/2})Q^{k+1/2}(z_{j-1/2})\right] , \end{aligned} \end{aligned}$$
(29)

where \(Q^{k+1/2}\) is the polynomial reconstruction of \(P^{k+1/2}(x)\) and \(z_{j-1/2}\) is the unique solution to the backward characteristic equation

$$\begin{aligned} \dfrac{x_{j-1/2}-x}{\Delta t/2} = b(x,t^{k+1/2},u^{k+1/2}), \qquad x\in I_{j - 1/2} ; \end{aligned}$$
(30)

see Fig. 3.

We summarize the above procedure in the following algorithm.

Algorithm 4.1

(Sanders TVD FV scheme).

1.

    Let the initial condition be given as \(\rho _0(x)\). Let the cell averages and cell boundary values corresponding to \(\rho _0\) be denoted as \(\bar{\rho }^0_j\) and \(\rho ^0_{j+1/2}\) respectively.

2.

    For time steps \(k= 0,1,2,\ldots ,\) the following procedure is implemented

(a)

      Construct a piecewise quadratic polynomial approximation to \(\rho ^k(x)\) as in (24).

(b)

Solve the backward characteristic equation (27) in \(I_j\) to get a unique solution \(y_j\). Consider the staggered cell \(I_{j-1/2}\). The staggered cell endpoint values are evaluated as \(P^{k+1/2}(x_j) = P^k(y_j)\) and the cell average of \(P^{k+1/2}(x)\) on \(I_{j-1/2}\) is evaluated by the formula

      $$\begin{aligned} \begin{aligned} \bar{P}^{k+1/2}_{j-1/2}&= \dfrac{1}{h}\left[ \int _{y_{j-1}}^{y_j}\, P^k(x)dx- \dfrac{\Delta t}{2}~b(y_j,t^k,u^{k+1/2})\, P^k(y_j)+(x_j-y_j)\, P^k(y_j)\right. \\&\left. \quad +\dfrac{\Delta t}{2}~b(y_{j-1},t^k,u^{k+1/2})\, P^k(y_{j-1})-(x_{j-1}-y_{j-1}) \, P^k(y_{j-1})\right] \end{aligned} \end{aligned}$$
      (31)
(c)

      Using the values \(\sup _{[y_{j-1},y_j]} P^k(\cdot ),\qquad \inf _{[y_{j-1},y_j]} P^k(\cdot )\), a piecewise parabolic reconstruction of \(P^{k+1/2}(x)\), denoted as \(Q^{k+1/2}\), is constructed.

(d)

With the unique solution \(z_{j+1/2}\) of (30) in \(I_{j+1/2}\), the cell average of \(\rho ^{k+1}\) in \(I_j\) is computed by the following formula. The cell endpoint values are evaluated as \(\rho ^{k+1}(x_{j-1/2}) = Q^{k+1/2}(z_{j-1/2})\), and the cell average of \(\rho ^{k+1}\) is given by

      $$\begin{aligned} \begin{aligned} \bar{\rho }_j^{k+1}=\bar{Q}^{k+1}_{j} =&\dfrac{1}{h}\Bigl [\int _{z_{j-1/2}}^{z_{j+1/2}}Q^{k+1/2}(x)dx- \dfrac{\Delta t}{2}~b(z_{j+1/2},t^{k+1/2},u^{k+1/2})\, \\&\times Q^{k+1/2}(z_{j+1/2}) +(x_{j+1/2}-z_{j+1/2})\, Q^{k+1/2}(z_{j+1/2})\\&+\dfrac{\Delta t}{2}~b(z_{j-1/2},t^{k+1/2},u^{k+1/2})\, Q^{k+1/2}(z_{j-1/2})\\&\quad -(x_{j-1/2}-z_{j-1/2})\, Q^{k+1/2}(z_{j-1/2})\Bigr ]. \end{aligned} \end{aligned}$$
      (32)
(e)

      Go to (a) to compute the solution for the next time step.

Now, we prove that Sanders' scheme is conservative.

Theorem 4.1

(Conservative form). The scheme of Sanders [28], given by Algorithm 4.1, can be written in a conservative form.

Proof

Using (28), we can write the cell averages at time \(t^k+\Delta t/2\) as

$$\begin{aligned} \bar{P}^{k+1/2}_{j-1/2} = \bar{P}^k_{j-1/2}-\dfrac{1}{2}\dfrac{\Delta t}{h} (\hat{f}_j-\hat{f}_{j-1}), \end{aligned}$$
(33)

where the numerical flux is given as follows

$$\begin{aligned} \hat{f}_j = \dfrac{2}{\Delta t} \int _{y_j}^{x_j} P^k(x)dx + b(y_j,t^k,u^{k+1/2}) \, P^k_j-\dfrac{2(x_j-y_j)}{\Delta t}P^k_j. \end{aligned}$$

Again from (29), we can write the cell averages at time \(t^{k}+\Delta t\) as

$$\begin{aligned} \bar{\rho }^{k+1}_j=\bar{Q}^{k+1}_j = \bar{Q}^{k+1/2}_{j}-\dfrac{1}{2}\dfrac{\Delta t}{h} (\hat{h}_{j+1/2}-\hat{h}_{j-1/2}), \end{aligned}$$
(34)

where the numerical flux is given as follows

$$\begin{aligned} \hat{h}_{j+1/2}= & {} \dfrac{2}{\Delta t} \int _{z_{j+1/2}}^{x_{j+1/2}} Q^{k+1/2}(x)dx + b(z_{j+1/2},t^{k+1/2},u^{k+1/2}) \, Q^{k+1/2}_{j+1/2}\\&-\dfrac{2(x_{j+1/2}-z_{j+1/2})}{\Delta t} \, Q^{k+1/2}_{j+1/2}. \end{aligned}$$

The fluxes \(\hat{f}\) and \(\hat{h}\) are consistent in the sense that \(\hat{f}_j =b(y_j,t^k,u^{k+1/2})\bar{\rho }_1\) and \(\hat{h}_{j+1/2} =b(z_{j+1/2},t^{k+1/2},u^{k+1/2})\bar{\rho }_2\) for constant states \(\bar{\rho }_1, \bar{\rho }_2\). Thus the scheme of Sanders described above is conservative. \(\square \)
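The conservative form (33)-(34) can also be checked numerically: any update written as a difference of interface fluxes leaves the total sum of cell averages unchanged. A minimal sketch with arbitrary (random) fluxes and periodic indexing:

```python
import numpy as np

# Any update u_j^{new} = u_j - lam*(f_{j+1/2} - f_{j-1/2}) telescopes, so the
# total sum of cell averages is preserved; this is the content of (33)-(34).
rng = np.random.default_rng(0)
u = rng.random(50)                     # arbitrary cell averages
f = rng.random(50)                     # arbitrary interface fluxes f_{j+1/2}
lam = 0.4
u_new = u - lam * (f - np.roll(f, 1))  # f_{j+1/2} - f_{j-1/2}, periodic indexing
```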

Next, we state the positivity of Sanders' scheme, which has been proved in [28, Lemma 3.2]. In our case, positivity of Sanders' scheme means that

$$\begin{aligned} \rho ^0_j \ge 0 \, \Rightarrow \, \rho ^k_j \ge 0,\qquad \forall (x_j,t^k), \end{aligned}$$

where \(\rho ^k_j\) represents the numerical solution obtained using Sanders' scheme at the cell center \(x_j\) and time \(t^k\). For positivity, the following condition is enforced.

$$\begin{aligned} \max _{x\in \Omega }| b'(x,t,u)| \lambda < 1,\qquad \forall t\in [0,T], \end{aligned}$$
(35)

where \(\lambda =\dfrac{\Delta t}{h}\).
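A time step satisfying (35) can be chosen as in the following sketch, where the safety margin is an illustrative choice:

```python
def cfl_time_step(b_prime_max, h, safety=0.9):
    """Choose dt so that max|b'(x,t,u)| * lambda = safety < 1, with
    lambda = dt/h, enforcing the CFL-type condition (35)."""
    return safety * h / b_prime_max

dt = cfl_time_step(2.0, 0.1)   # here lambda = dt/h = 0.45
```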

Theorem 4.2

(Positivity). The Sanders scheme given in Algorithm 4.1 is positive under the condition (35).

Next, we prove \(L^1\) convergence of Sanders' scheme. For this purpose, we state and prove the following a priori estimate using techniques similar to those in [19].

Theorem 4.3

(A priori estimate). Let \(\rho \) be the exact solution to the Liouville equation (20) and \(\lbrace \rho _h \rbrace \) be the corresponding approximate solution computed by Algorithm 4.1. Let

$$\begin{aligned} F:= \partial _t \rho _h + \nabla \cdot (b\rho _h), \end{aligned}$$

denote the truncation error. Then, for a finite time \(0\le T \le T_0\), there exist constants \(C_0=C_0(T)\) and \(C_1=C_1(T)\) such that the following a priori estimate holds

$$\begin{aligned} \Vert \rho _h (\cdot ,T)- \rho (\cdot ,T)\Vert _{L^1} \le C_0 \Vert \rho _h(\cdot ,0)- \rho _0\Vert _{L^1}+C_1\Vert F(\cdot ,\cdot )\Vert _{L^1(Q)}. \end{aligned}$$
(36)

Proof

Let \(e(x,t) = \rho _h(x,t)-\rho (x,t)\) denote the error in the numerical solution. Then e satisfies the following equation

$$\begin{aligned} \begin{aligned}&\partial _t e + (\nabla \cdot b)e +b\cdot \nabla e = F. \end{aligned} \end{aligned}$$
(37)

In order to study the stability of (37), we consider its dual equation

$$\begin{aligned} \begin{aligned}&\partial _t \psi -(\nabla \cdot b)\psi + \nabla \cdot (b\psi ) = 0,\\&\psi (x,T)=\psi _T(x), \end{aligned} \end{aligned}$$
(38)

where the backward initial condition \(\psi _T(x)\) is smooth and compactly supported in Q and \(\psi (x,t)\) is compactly supported for all t. To obtain the \(L^1\) estimate (36), we need the solution of the error equation (37) to be in \(L^\infty (0,T;L^1(\mathbb {R}^n))\) and the solution of the dual equation (38) in \(L^\infty (0,T;L^\infty (\mathbb {R}^n))\). For this purpose, we notice that Assumption A1 implies that \(b(\cdot ,t,u) \in W^{1,\infty }(\mathbb {R}^n)\). Under this condition, there exists a solution of (37) in \(L^\infty (0,T;L^1(\mathbb {R}^n))\) and a solution of (38) in \(L^\infty (0,T;L^\infty (\mathbb {R}^n))\) as proved in [14, Prop. II.I].

Multiplying (37) by \(\psi \) and (38) by e, adding the resulting equations, integrating over \(\Omega \), and using the compact support of \(\psi \), we obtain the following

$$\begin{aligned} \dfrac{d}{dt}\int _\Omega e \psi = \int _\Omega F\psi . \end{aligned}$$

Integrating over [0, T], we have

$$\begin{aligned} \int _\Omega e(\cdot ,T) \psi _T(\cdot ) = \int _\Omega e(\cdot ,0) \psi (\cdot ,0)+ \int _0^T\int _\Omega F\psi . \end{aligned}$$

Using Hölder’s inequality, we get

$$\begin{aligned} \left| \int _\Omega e(\cdot ,T) \psi _T(\cdot ) \right| \le \Vert e(\cdot ,0)\Vert _{L^1(\Omega )} \Vert \psi (\cdot ,0)\Vert _{L^\infty (\Omega )}+ \Vert F\Vert _{L^1(Q)}\Vert \psi (\cdot ,t)\Vert _{L^\infty (\Omega )}. \end{aligned}$$

Thus, we have

$$\begin{aligned} \Vert e(\cdot ,T)\Vert _{L^1} = \sup _{\Vert \psi _T\Vert _{L^\infty }=1}\left| \int _\Omega e(\cdot ,T) \psi _T(\cdot ) \right| \le C_0(T)\Vert e(\cdot ,0)\Vert _{L^1(\Omega )} + C_1(T)\Vert F\Vert _{L^1(Q)}, \end{aligned}$$

where

$$\begin{aligned} C_0(T) = \sup _{\psi _T} \dfrac{\Vert \psi (\cdot ,0)\Vert _{L^\infty }}{\Vert \psi _T\Vert _{L^\infty }}, \qquad C_1(T) = \sup _{\psi _T}\dfrac{\Vert \psi (\cdot ,t)\Vert _{L^\infty }}{\Vert \psi _T\Vert _{L^\infty }}. \end{aligned}$$

It remains to estimate \(\Vert \psi (\cdot ,t)\Vert _{L^\infty }\) in terms of \(\psi _T(x)\), for all \(0 \le t \le T\). Multiplying (38) by \(sgn(\psi )\), we get the following

$$\begin{aligned} \dfrac{d}{dt}\Vert \psi (\cdot ,t)\Vert _{L^\infty }\le 2 \Vert \nabla \cdot b\Vert _{L^\infty (\Omega )}(t)\Vert \psi (\cdot ,t)\Vert _{L^\infty }. \end{aligned}$$

By Assumption A1, given in (2.1), \(\Vert \nabla \cdot b\Vert _{L^\infty } \le k(t)\) for all t, and thus we have

$$\begin{aligned} \dfrac{d}{dt}\Vert \psi (\cdot ,t)\Vert _{L^\infty } \le k(t)\Vert \psi (\cdot ,t)\Vert _{L^\infty }. \end{aligned}$$

Using Gronwall’s inequality, we obtain the following

$$\begin{aligned} \Vert \psi (\cdot ,t)\Vert _{L^\infty } \le \Vert \psi _T\Vert _{L^\infty } ~\exp \left( {\int ^T_t k(s) ds}\right) = D(T) \Vert \psi _T\Vert _{L^\infty }, \qquad \forall t\in [0,T]. \end{aligned}$$

Choosing \(C_0(T) = C_1(T) = D(T)\), we have the desired result. \(\square \)

Since the scheme given by Algorithm 4.1 is a Godunov-type scheme, it has the following recursive form

$$\begin{aligned} \rho _h(\cdot ,t) = \left\{ \begin{array}{ll} \mathcal{E}(t-t^{k-1}) \, \rho _h(\cdot ,t^{k-1}),&{} \qquad t^{k-1}<t<t^k,\\ P_h \, \rho _h(\cdot ,t^k-0),&{} \qquad t=t^k, \end{array} \right. \end{aligned}$$
(39)

where \(k=1, \ldots , N\), and subject to the initial data

$$\begin{aligned} \rho _h(\cdot ,0) = P_h \rho _0(x). \end{aligned}$$

Notice that \(\mathcal{E}\) represents the exact evolution operator associated with the Liouville equation (4) and \(P_h\) is the discrete projection operator defined through the reconstruction step described in Algorithm 4.1.

It has been shown in [19, Lemma 2.1] that the truncation error F, defined in Theorem 4.3, satisfies the following inequality

$$\begin{aligned} \Vert F(\cdot ,\cdot )\Vert _{L^1(Q)} \le \dfrac{T}{\Delta {t}}\max _{0< t^k \le T}\Vert (I-P_h) \, \rho _h(\cdot ,t^k-0)\Vert _{L^1(\Omega )}. \end{aligned}$$
(40)

Finally, we need an estimate for the right-hand side of (40). This is given in [28, Lemma 3.3] and can be stated as follows.

Lemma 4.1

Suppose \(\rho _0\in C^4\) and \(\Delta {t}\) is chosen such that (35) is satisfied. Then we have

$$\begin{aligned} \Vert (I-P_h) \, \rho _h(\cdot ,t^k-0)\Vert _{L^1(\Omega )} \le Ch^3, \qquad \forall 0< t^k \le T. \end{aligned}$$

Using Lemma 4.1, we can reformulate (40) as follows

$$\begin{aligned} \Vert F(\cdot ,\cdot )\Vert _{L^1(Q)} \le CT\dfrac{h}{\Delta {t}}~h^2= CT\frac{1}{\lambda }~ h^2 ; \end{aligned}$$
(41)

see [28, eq. (3.5)].

Using Theorem 4.3 and Lemma 4.1, we get the following convergence result for the Sanders scheme.

Theorem 4.4

The scheme described in Algorithm 4.1 is second-order accurate in the \(L^1\)-norm, that is,

$$\begin{aligned} \Vert \rho _h(x,T)-\rho (x,T)\Vert _1 \le D(T)h^2, \end{aligned}$$

under the CFL condition (35).

Proof

Since in our numerical scheme we use the exact initial condition, from (36) we have

$$\begin{aligned} \Vert \rho _h (\cdot ,T)- \rho (\cdot ,T)\Vert _{L^1} \le C_1\Vert F(\cdot ,\cdot )\Vert _{L^1(Q)}. \end{aligned}$$

Now (41) gives us

$$\begin{aligned} \Vert F(\cdot ,\cdot )\Vert _{L^1(Q)} \le CT\frac{1}{\lambda }~ h^2. \end{aligned}$$

Thus we get

$$\begin{aligned} \Vert \rho _h (\cdot ,T)- \rho (\cdot ,T)\Vert _{L^1} \le CT\frac{1}{\lambda }~ h^2 = D(T) h^2, \end{aligned}$$

which proves the desired result. \(\square \)

We conclude this section by presenting results of numerical experiments that validate our theoretical estimate of the order of accuracy of Sanders' scheme, using a smooth initial condition with extrema. We consider the following test case

$$\begin{aligned} \rho _t+ \rho _x=0, \qquad (x,t)\in (0,1)\times (0,2], \end{aligned}$$

with the initial condition

$$\begin{aligned} \rho _0(x) = \sin (2\pi x) \end{aligned}$$

and periodic boundary conditions. Figure 4 shows the plots of the exact and numerical solutions at the times \(t=2,6\); one can see that the two solutions overlap.

Fig. 4: Plots of the exact and numerical solutions at various times

We also compute the relative \(L^1\) error using different mesh sizes. Table 1 shows that the order of accuracy of Sanders' scheme is 3. Notice that there are only finitely many extrema and thus there is no loss in the rate of \(L^1\) convergence.
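The orders reported in Table 1 can be computed from the errors on successively refined meshes; a minimal sketch:

```python
import numpy as np

def observed_order(errors, ratio=2.0):
    """Estimate the convergence order p from errors e_h on meshes refined by
    `ratio`: p ~ log(e_h / e_{h/ratio}) / log(ratio)."""
    e = np.asarray(errors, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(ratio)

# errors behaving like C*h^3 on meshes h, h/2, h/4 give orders close to 3
orders = observed_order([1e-3, 1.25e-4, 1.5625e-5])
```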

5 Numerical Optimization

In Sect. 3, we have discussed the characterization of an optimal control in the PMP framework, which we have conveniently re-written in the form (18), that is, find \(u(t) \in U\) that solves the following minimization problem

$$\begin{aligned} \min _{w\in U} \, \sum _{j=1}^m w_j \int _{\mathbb {R}^n} \, \nabla \cdot ( b_j(x,t)\, {\rho ^*(x,t)} ) \, q^*(x,t)~dx . \end{aligned}$$
(42)

Notice that, since the control u is independent of the spatial variable x, the term under the integral in (42) does not depend explicitly on the control variable.

Table 1: Relative errors measured in the \(L^1\)-norm for different mesh sizes at time \(t=2\)

Now, we use (42) as the starting point for formulating an iterative scheme to solve our optimization problem at all times. In fact, we notice that (42) requires knowledge of the optimal \(\rho ^*\) and \(q^*\), which in turn require knowledge of the optimal control \(u^*\). For this reason, one may attempt to use (42) iteratively in the following way. Given an initial guess for the control u, say \(\widetilde{u}\), we could compute the corresponding \(\rho =\rho (\widetilde{u})\) and \(q=q(\widetilde{u})\), insert these functions in (42), and solve the resulting minimization problem to get an update for u. With this new approximation to the optimal control \(u^*\), we could repeat this fixed-point procedure and, in case of convergence, determine \(u^*\).

However, in our case, the procedure just described does not converge. The reason appears to be that (42) results in a linear programming problem whose solution is always at the boundary of the polyhedron U and very sensitive to the current values of \(\rho \) and q.

On the other hand, we recognize that (42) is equivalent to the following non-linear optimization problem

$$\begin{aligned} u^*(t)=\mathop {\hbox {arg min}}\limits _{w\in U} \, \sum _{j=1}^m w_j \int _{\mathbb {R}^n} \, \nabla \cdot ( b_j(x,t)\, {\rho (u^*)(x,t)} ) \, q(u^*)(x,t)~dx . \end{aligned}$$

This fact suggests that we should explicitly take into account the dependence of \(\rho \) and q on the control u, namely \(\rho =\rho (u)\) and \(q=q(u)\). This is also the key consideration in the development of direct methods for ODE control problems, which we prefer to avoid since they become cumbersome in a PDE setting. For this reason, we aim at developing an iterative procedure that takes advantage of the fact that, numerically, the discretization of the forward and adjoint Liouville problems provides local maps, \(\rho ^k_h = \rho ^k_h(w)\) and \(q^k_h = q^k_h(w)\), where w denotes the value of the control used to compute \(\rho _h\) and \(q_h\) at \(t^k\), assuming that the values of \(\rho _h\) and \(q_h\) entering the scheme at the remaining time steps are already known. Therefore, our scheme is designed to solve the following optimal control problem

$$\begin{aligned} \min _{w\in U} \, \sum _{j=1}^m w_j \int _{\mathbb {R}^n} \, \nabla \cdot ( b_j(x,t^k)\, {\rho (w)(x,t^k)} ) \, q(w)(x,t^k)~dx . \end{aligned}$$
(43)

at each \(t^k\) separately, assuming that \(\rho \), q, and u coincide everywhere else with the optimal solution. However, this reasoning reveals that our PMP problem is a nonlinear programming problem. This observation is the starting point for the discussion that follows.

To illustrate our optimization method, we consider a one-dimensional problem and a simple discretization scheme. Later, we return to Sanders' scheme.

Assume that the functions being integrated in (43) have compact supports in a bounded domain \(\Omega \subset {\mathbb R}\). We denote with \(\rho _i^k(w)\) the finite difference approximation to \(\rho (w)(x_i,t^k)\), where \(x_i \in \Omega _h\); we use a similar notation for the adjoint variable. For simplicity, we also choose \(m=1\), \(b_0=0\) and \(b_1=1\) in (9) and assume that U is such that \(w \ge 0\).

Since we assume a piecewise constant control function \(u=u^{k+1/2}\) in \((t^k, t^{k+1})\), corresponding to the time step \(t^k\), we use an average of the forward and the adjoint solutions at \(t^k\) and \(t^{k+1}\) to approximate (43) and obtain the following optimization problem

$$\begin{aligned} \min _{w\in U} \, w \sum _{x_i \in \Omega _h} \, \dfrac{h}{2} \, [D_x^+ \, (\rho _i^k(w)+\rho _i^{k+1}(w))] \, \dfrac{1}{2}[q_i^k(w)+q_i^{k+1}(w)] , \end{aligned}$$
(44)

where \(D_x^+ v_i= (v_{i+1}-v_i)/h\). We also need to introduce \(D_x^- v_i= (v_i - v_{i-1})/h\).

Now, consider the following first-order upwind scheme to construct a discrete control-to-state map

$$\begin{aligned} \rho _i^k(w) = \rho _i^{k-1} - w \, \Delta t \, D_x^- \rho _i^{k-1} . \end{aligned}$$

Similarly for the adjoint variable, we have

$$\begin{aligned} q_i^k(w) = q_i^{k+1} + w \Delta t \, D_x^+ q_i^{k+1} . \end{aligned}$$
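These two local maps can be sketched as follows; we use periodic indexing for the one-sided differences, which is harmless here since the densities are assumed compactly supported inside the domain.

```python
import numpy as np

def Dx_plus(v, h):
    """Forward difference D_x^+ v_i = (v_{i+1} - v_i)/h (periodic indexing)."""
    return (np.roll(v, -1) - v) / h

def Dx_minus(v, h):
    """Backward difference D_x^- v_i = (v_i - v_{i-1})/h (periodic indexing)."""
    return (v - np.roll(v, 1)) / h

def rho_map(rho_prev, w, dt, h):
    """First-order upwind control-to-state map: rho_i^k(w) = rho_i^{k-1} - w*dt*D_x^- rho_i^{k-1}."""
    return rho_prev - w * dt * Dx_minus(rho_prev, h)

def q_map(q_next, w, dt, h):
    """Control-to-adjoint map: q_i^k(w) = q_i^{k+1} + w*dt*D_x^+ q_i^{k+1}."""
    return q_next + w * dt * Dx_plus(q_next, h)
```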

Next, we use these maps in (44) and obtain the following

$$\begin{aligned}&\min _{w\in U} \, w \sum _{x_i \in \Omega _h} \, \dfrac{h}{2} \, \left( D_x^+ \, \left( \rho _i^{k}+\rho _i^{k-1} - w \Delta t \, D_x^- (\rho _i^k+\rho _i^{k-1})\right) \right) \\&\times \dfrac{1}{2} (q_i^{k+1}+q_i^k + w \Delta t \, D_x^+ (q_i^{k+1}+q_i^k)) . \end{aligned}$$

One can recognize that a cubic polynomial in w is obtained.

A similar result is obtained in two dimensions, and we argue that this is true in all dimensions. This result suggests that the solution of the nonlinear programming problem above may equally well belong to the interior of U or to its boundary. To solve (44), one can use, among others, the method developed in [16]. In the one-dimensional case, we consider a uniform grid of values in U and find the minimum by direct search.
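A minimal sketch of this direct search, assuming the cubic in w has already been assembled into its coefficients:

```python
import numpy as np

def direct_search_min(coeffs, u_a, u_b, n=1001):
    """Minimize the cubic c0 + c1*w + c2*w^2 + c3*w^3 over U = [u_a, u_b]
    by evaluating it on a uniform grid, as described above."""
    w = np.linspace(u_a, u_b, n)
    vals = np.polyval(coeffs[::-1], w)    # np.polyval expects highest degree first
    return w[np.argmin(vals)]

# the minimizer of w^2 - 2w (c0=0, c1=-2, c2=1, c3=0) on [-1, 2] is w = 1
w_star = direct_search_min([0.0, -2.0, 1.0, 0.0], -1.0, 2.0)
```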

Next, we illustrate our PMP optimization step using Sanders' scheme as given in Algorithm 4.1.

Using (34) and the definition of \(\bar{Q}^{k+1/2}\), we have the following local discrete control-to-state map

$$\begin{aligned} \begin{aligned} \bar{\rho }^{k}_i(w)&= P^{k-1}(y_j)-\dfrac{1}{2}\dfrac{\Delta t}{h}\Bigg [ \dfrac{2}{\Delta t} \int _{z_{i+1/2}}^{x_{i+1/2}} Q^{k-1/2}(x)dx -\dfrac{2}{\Delta t} \int _{z_{i-1/2}}^{x_{i-1/2}} Q^{k-1/2}(x)dx\\&\quad -\dfrac{2(x_{i+1/2}-z_{i+1/2})}{\Delta t}Q^{k-1/2}_{i+1/2}+\dfrac{2(x_{i-1/2}-z_{i-1/2})}{\Delta t}Q^{k-1/2}_{i-1/2}\\&\quad + w \, (Q^{k-1/2}_{i+1/2}- Q^{k-1/2}_{i-1/2})\Bigg ] . \end{aligned} \end{aligned}$$
(45)

Similarly, we obtain the following local control-to-adjoint map

$$\begin{aligned} \begin{aligned} \bar{q}^{k}_i(w)&= P^{k+1}(y_j)-\dfrac{1}{2}\dfrac{\Delta t}{h}\Bigg [ \dfrac{2}{\Delta t} \int _{z_{i+1/2}}^{x_{i+1/2}} Q^{k+1/2}(x)dx -\dfrac{2}{\Delta t} \int _{z_{i-1/2}}^{x_{i-1/2}} Q^{k+1/2}(x)dx\\&\quad -\dfrac{2(x_{i+1/2}-z_{i+1/2})}{\Delta t}Q^{k+1/2}_{i+1/2}+\dfrac{2(x_{i-1/2}-z_{i-1/2})}{\Delta t}Q^{k+1/2}_{i-1/2}\\&\quad - w \, (Q^{k+1/2}_{i+1/2}- Q^{k+1/2}_{i-1/2})\Bigg ] . \end{aligned} \end{aligned}$$
(46)

We use these maps in (44) and obtain a cubic polynomial in w.

The solution \(w^*\) to this minimization problem provides the update to the control at \(t^{k+1/2}\), \(u^{k+1/2}=w^*\). Correspondingly, we can insert this value in (45) to obtain an update to \(\bar{\rho }^{k}\), and in (46) to obtain an update to \(\bar{q}^k\). However, we need to take care of the fact that the state variable evolves forward in time, while the adjoint variable evolves backward. Therefore, a complete update sweep consists of a forward sweep, where \(u^{k+1/2}\) and \(\bar{\rho }^{k}\) are updated while the adjoint variable remains unchanged, followed by a backward sweep, where the control is again updated together with the adjoint variable. Notice that only in this way are the initial condition for \(\rho \) and the terminal condition for q correctly implemented.

Remark 5.1

For the discussion above, we have assumed a drift b that is divergence-free. In this case, the adjoint equation is also conservative and thus we can use Sanders' scheme. In the case of a non-divergence-free drift, we write the adjoint equation \(q_t + b\cdot \nabla q=0\) as \(q_t + \nabla \cdot (bq) = (\nabla \cdot b)q\). Then \(q^k = q^{k+1}_S + \Delta t \, (\nabla \cdot b(x, t^{k+1}, u^{k+1/2}))q^{k+1}\), where \(q^{k+1}_S\) represents the approximation of the conservative evolution obtained with Sanders' scheme. Results of numerical experiments with smooth solutions show that this approach is at least second-order accurate.

In the following, we present a pseudo-code that implements our time-splitted nonlinear collective update.

Algorithm 5.1

[Time-splitted nonlinear collective update (TSNCU) scheme]. Input: Number of iterations M, initial guess for the control, \(u=u_0\): compute the corresponding \(\rho _0\) and \(q_0\).

1.

    for \(m=0, \ldots , M\) (outer iteration loop)

2.

    for \(n=1, \ldots , N\) (first inner iteration loop; forward update)

3.

    Solve (44) to obtain \(u^{n+1/2}_{m+1}\), and update \(\bar{\rho }^{n+1}\) using (45).

4.

    end

5.

    for \(k=N-1, \ldots , 0\) (second inner iteration loop; backward update)

6.

    Solve (44) to compute \(u^{k+1/2}_{m+1}\), and update \(q^k\) using (46).

7.

    end

8.

    break if the convergence criterion is fulfilled: \(\Vert u_{m+1}-u_{m}\Vert _{L_{\delta t}^2(0,T)} < \epsilon \).

9.

    end (go to repeat the outer iteration)
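The control flow of Algorithm 5.1 can be sketched as follows; the local solves of (44)-(46) are abstracted into placeholder callables, whose names are our own.

```python
def tsncu(u, forward_update, backward_update, local_argmin, N, M, eps=1e-8):
    """Structural sketch of the TSNCU iteration: `local_argmin` stands for the
    pointwise-in-time solve of (44), `forward_update`/`backward_update` for the
    state update (45) and the adjoint update (46); all are placeholders."""
    for _ in range(M):
        u_old = u.copy()
        for n in range(N):                 # forward sweep: update u and rho
            u[n] = local_argmin(n, u)
            forward_update(n, u)
        for k in range(N - 1, -1, -1):     # backward sweep: update u and q
            u[k] = local_argmin(k, u)
            backward_update(k, u)
        if max(abs(a - b) for a, b in zip(u, u_old)) < eps:
            break                          # discrete convergence criterion
    return u
```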

Our TSNCU scheme performs very differently from a gradient-based scheme. For comparison, we use the reduced gradient given by (19) and implement a standard gradient-based optimization procedure to maximize the objective. In particular, a steepest ascent scheme implements the following update step

$$\begin{aligned} u^{(l+1)} = u^{(l)} + \alpha \, \nabla _{u} \hat{J}(u^{(l)}) , \end{aligned}$$
(47)

where l indexes the iteration steps.

Notice that this gradient procedure should be combined with a projection step onto \(\mathcal{U}\) and requires a line search to estimate \(\alpha \). Therefore, we consider the following

$$\begin{aligned} u^{(l+1)} = P_{U}\left[ u^{(l)} + \alpha \, \nabla _u \hat{J}(u^{(l)}) \right] . \end{aligned}$$
(48)
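One step of (48) amounts to a gradient ascent update followed by a componentwise projection onto the box \(U=[u_a,u_b]\); a minimal sketch:

```python
import numpy as np

def projected_ascent_step(u, grad, alpha, u_a, u_b):
    """One step of (48): ascend along the reduced gradient, then project
    componentwise onto the box U = [u_a, u_b]."""
    return np.clip(u + alpha * grad, u_a, u_b)
```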

We remark that this optimization step updates u for all time steps in (0, T), whereas the PMP optimality condition is meant pointwise in time.

Notice that in the case of a \(\rho _0\), and consequently \(\rho \), with compact support, the reduced gradient becomes zero for any control u such that \(\mathrm{supp}\rho (u)(\cdot ,t) \cap \mathcal{B}(u)^t = \emptyset \) for all \(t \in [0,T]\). A similar problem is encountered with the PMP formulation. We could say that the gradient is non-zero only in a sufficiently small neighbourhood of \(u^*\) in \(\mathcal{U}\). Thus the optimization landscape very much resembles a golf course: flat almost everywhere. This is quite different from most PDE optimization problems, where the gradient is always non-zero.

To address this problem, one can replace the compactly supported \(\rho _0\) with its mollification, as discussed in (23), using

$$\begin{aligned} \varphi (x)=\frac{1}{\sqrt{2 \pi }} \, \exp (-x^2/2) . \end{aligned}$$
(49)

Notice that this function satisfies all conditions for being a mollifier, apart from having compact support. It is precisely this lack of compact support that allows us to overcome the problem of an almost flat optimization landscape. In fact, we have

$$\begin{aligned} \mathrm{supp}\rho _0^\epsilon \subset \mathrm{supp}\rho _0 + \mathrm{supp}\varphi _\epsilon , \end{aligned}$$

where \(+\) denotes the Minkowski addition. Thus, using the Gaussian mollifier above, \(\rho _0^\epsilon \) is non-zero everywhere. Moreover, when necessary, we assume to perturb the target set \(\mathcal{B}\) such that \(\partial \mathcal{B}_{r\epsilon }\) is at least \(C^{1,1}\) and the outer normal is well defined. We note that these two modifications may also be needed for the existence of minimizers, as discussed in Sect. 3.

6 Numerical Experiments

In this section, we consider different test cases to discuss the efficiency and robustness of the time-splitted nonlinear collective update scheme. In particular, we compare our TSNCU scheme with a projected nonlinear conjugate gradient (PNCG) scheme, and demonstrate the ability of the former to compute optimal controls for Liouville equations in which the velocity field b is a non-differentiable function of u. For clarity, we define different test cases.

In our test Case 1, we consider a Liouville problem with \(b(x,t,u)=u\). Let \(U=[u_a, u_b]\) and \(\mathcal{B}=[r_T,s_T]\). The optimal control without constraints is given by \(\bar{u}=(r_T+s_T)/(2T)\).

For a given \(u=u(t)\), we have \(\mathcal{B}^t=[r_t,s_t]\), where

$$\begin{aligned} r_t=r_T + \int _T^t u(\tau ) d\tau , \qquad s_t=s_T + \int _T^t u(\tau ) d\tau . \end{aligned}$$

Notice that these equations define the map \(u \mapsto \mathcal{B}=\mathcal{B}(u)\), such that \(s_t=s_t(u)\) and \(r_t=r_t(u)\). Further, we have \(\eta _{\mathcal{B}^t}(r_t)=-1\) and \(\eta _{\mathcal{B}^t}(s_t)=1\).
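For a control given on a time grid, the map \(u \mapsto \mathcal{B}^t\) can be approximated as in the following sketch, which uses the trapezoidal rule for the time integral (the function name is our own):

```python
import numpy as np

def target_set_at_t(r_T, s_T, u, t_grid, t):
    """Transport the target interval [r_T, s_T] from the final time T back to
    time t for the drift b = u(t): r_t = r_T + int_T^t u, and similarly s_t.
    The integral is approximated by the trapezoidal rule on t_grid."""
    mask = t_grid >= t
    tt, uu = t_grid[mask], u[mask]
    integral_t_to_T = float(np.sum(0.5 * (uu[1:] + uu[:-1]) * np.diff(tt)))
    shift = -integral_t_to_T            # int_T^t = -int_t^T
    return r_T + shift, s_T + shift
```

For the constant control \(u \equiv \bar u\), this reproduces \(r_0 = r_T - \bar u \, T\).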

Now, we present results of numerical experiments with the PNCG scheme and the TSNCU scheme. Our first experiment considers the setting above with control constraints, \(u_a=-1\) and \(u_b=2.5\). We choose \(T=2\) and \(B=\chi _{[2,3]}\). To define the initial density, we introduce the following function

$$\begin{aligned} \rho _0(x)= {\left\{ \begin{array}{ll} &{}1,\qquad -c^{1/2}< x+2 < c^{1/2},\\ &{}0, \qquad \text{ otherwise, }\\ \end{array}\right. } \end{aligned}$$
(50)

where \(c=(3/4)^{2/3}\). With this function, we define \(\rho _0^\epsilon =\rho _0 * \varphi _\epsilon \) with \(\varphi \) given by (49).
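The mollification \(\rho _0^\epsilon = \rho _0 * \varphi _\epsilon \) can be approximated by a discrete convolution; in the sketch below, the grid and the value of \(\epsilon \) are illustrative.

```python
import numpy as np

# Discretize the initial density (50) on a uniform grid and mollify it with
# the scaled Gaussian phi_eps(x) = (1/eps)*phi(x/eps), with phi as in (49).
x = np.linspace(-8.0, 8.0, 401)
h = x[1] - x[0]
c = (3.0 / 4.0) ** (2.0 / 3.0)
rho0 = (np.abs(x + 2.0) < np.sqrt(c)).astype(float)   # initial density (50)

def mollify(rho, eps):
    """Discrete convolution rho * phi_eps; h * convolve approximates the integral."""
    phi = np.exp(-((x - x.mean()) ** 2) / (2 * eps ** 2)) / (np.sqrt(2 * np.pi) * eps)
    return h * np.convolve(rho, phi, mode='same')

rho_eps = mollify(rho0, 0.5)
```

Note that the mollification preserves the total mass of \(\rho _0\) (up to quadrature and domain-truncation errors) while smearing its support over the whole grid.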

We consider our Liouville control problem in the spatial domain \(\Omega =(-8,8)\), which is uniformly discretized into 100 subintervals. The time interval (0, T) is discretized into 100 uniformly distributed time steps. The computations are implemented with MATLAB on an INTEL I7 2.3 GHz processor and 6 GB RAM.

We start the PNCG procedure using an initial guess \(u(t)=1.5\), which appears reasonably close to the optimal solution. Nevertheless, for the initial condition \(\rho _0\), the PNCG scheme fails to converge. For this reason, we use a continuation technique. We consider the initial density \(\rho _0^\epsilon \) with \(\epsilon =3\). With this setting the PNCG scheme converges and we obtain the solution given in Fig. 5.

Fig. 5: Case 1: PNCG solution \(\rho \) (left) at \(t=T\) with initial condition \(\rho _0^\epsilon \) for \(\epsilon =3\). Right: the optimal control

Next, we use the control u, corresponding to \(\epsilon =3\), as the initial guess for the control problem with \(\epsilon =2\). Further, using the control function resulting from this second run, we apply the PNCG scheme to solve the optimal control problem with \(\rho _0\) as described in (50). The plots of the solution at time \(t=T\) along with the optimal control are given in Fig. 6.

Fig. 6: Case 1: PNCG solution \(\rho \) (left) at \(t=T\) with initial condition \(\rho _0\). Right: the optimal control

Next, we employ our TSNCU scheme to solve the same problem as above with initial condition \(\rho _0(x)\) and with the initial guess \(u(t)=1.5\). The TSNCU scheme converges to the optimal solution in just 2 iterations. In Fig. 7, we depict the optimal solution obtained with the TSNCU scheme.

Fig. 7: Case 1: TSNCU solution \(\rho \) (left) at \(t=T\) with initial condition \(\rho _0\). Right: the optimal control

We repeat this experiment on finer space-time grids, halving the previous space- and time-grid sizes, and also perform the same experiment with the initial condition \(\rho _0^\epsilon \) for \(\epsilon =0.01\). The same optimal control is obtained. In Fig. 7, notice that the TSNCU scheme is able to compute a solution where the control bounds are nowhere active, even when the initial condition is not smooth.

In test Case 2, we discuss a setting that results in a bang-bang control. We consider \(b(x,t,u) = -2+ 4u\sin (\pi t)\), and choose \(T=2\) and \(B=\chi _{[3.5,4.5]}\). We impose the box constraints \(u_a = -2\) and \(u_b = 2\), and discretize \(\Omega \) and (0, T) with grids of size 100.

In Fig. 8, we plot the results obtained with the TSNCU scheme for the initial condition \(\rho _0\). The same optimal control is also obtained with the initial condition \(\rho _0^\epsilon \), where \(\rho _0\) is as in (50) and \(\epsilon =0.01\). We see that a bang-bang control with a switch at \(t=1\) is obtained.
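The switching time \(t=1\) can be read off from the structure of b: since \(b(x,t,u) = -2+ 4u\sin (\pi t)\) is linear in u, the PMP-maximizing control sits on a bound, selected by the sign of a switching function. A minimal sketch under the simplifying assumption (not taken from the paper) that the switching function has the sign of \(\sin (\pi t)\), so the control switches exactly where \(\sin (\pi t)\) changes sign on (0, 2):

```python
import numpy as np

u_a, u_b = -2.0, 2.0                  # box constraints from Case 2
t = np.linspace(0.0, 2.0, 101)        # 100 time steps on (0, T), T = 2
s = np.sin(np.pi * t)                 # assumed sign of the switching function
u = np.where(s >= 0.0, u_b, u_a)      # bang-bang: upper bound where s >= 0
```

With this convention the control equals \(u_b\) on (0, 1) and \(u_a\) on (1, 2); the actual sign pattern depends on the adjoint-based switching function, but the location of the switch at \(t=1\) is forced by the sign change of \(\sin (\pi t)\).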

Fig. 8

Case 2: TSNCU bang-bang solution \(\rho \) (left) at \(t=T\) with initial condition \(\rho _0\). Right: the optimal control

We also performed the same set of experiments with the PNCG scheme; as in test Case 1, a continuation procedure was required to compute the optimal control with initial condition \(\rho _0\).

We remark that in the bang-bang case, the TSNCU scheme converges after 2 iterations and, with space-time grids of size \(100\times 100\), the CPU time is 11.73 s. On a space-time grid of size \(200\times 200\), the average CPU time is 45.84 s. On the other hand, to obtain the same solution, the PNCG scheme with continuation requires 138.77 s on the space-time grid of size \(200\times 200\). Clearly, the TSNCU scheme outperforms the PNCG scheme.

In test Case 3, we consider a vector field b that is a non-differentiable function of u, namely \(b(x,t,u) = -2+ 4|u|\sin (\pi t)\). We choose \(T=2\) and \(B=\chi _{[3.5,4.5]}\). We impose the box constraint \(u_b = 2\) and discretize \(\Omega \) and (0, T) with grids of size 100. Notice that in this case the PNCG scheme cannot be used, as it requires the derivative of b with respect to u; this is not a limitation for our TSNCU scheme. Further, notice that (43) requires considering the absolute value of w. We choose the initial condition \(\rho _0\) as in (50). Results for the Case 3 experiment are reported in Fig. 9. As in Case 2, we obtain a switch at \(t=1\). However, because of the different control mechanism, the control becomes zero after the switching point. Moreover, the same optimal control is obtained with the initial condition \(\rho _0^\epsilon \), where \(\rho _0\) is as in (50) and \(\epsilon =0.01\).
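Why the control vanishes after the switch can be seen from the shape of the u-dependence: the relevant part of the pointwise maximization has the form \(w\,|u|\) for some weight w, which is piecewise linear on the control box with a kink at \(u=0\), so the maximizer is always one of the endpoints or zero. A derivative-free sketch (the weight w and the lower bound \(u_a=-2\) are assumptions for illustration, not taken from the paper):

```python
import numpy as np

def pointwise_max(w, u_a=-2.0, u_b=2.0):
    # The map u -> w*|u| is piecewise linear on [u_a, u_b] with a kink at 0,
    # so its maximizer lies in {u_a, 0, u_b}: compare three values, no
    # derivative of b with respect to u is needed.
    candidates = np.array([u_a, 0.0, u_b])
    return candidates[int(np.argmax(w * np.abs(candidates)))]
```

For \(w<0\) the maximizer is \(u=0\), which is consistent, under the assumed sign of w after \(t=1\), with the control vanishing after the switching point in Fig. 9.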

Fig. 9

Case 3: TSNCU bang-bang solution \(\rho \) (left) at \(t=T\) with initial condition \(\rho _0\). Right: the optimal control

7 Conclusion

In this paper, a class of Liouville optimal control problems was investigated with the purpose of designing new, accurate, and robust control strategies. On the one hand, a discretization scheme for the governing Liouville equation was analyzed, proving stability, second-order accuracy, and positivity. On the other hand, an efficient numerical realization of Pontryagin’s maximum principle (PMP) was presented and validated by results of numerical experiments. In particular, it was pointed out that the PMP-based optimization scheme accommodates Liouville control problems that are not differentiable with respect to the control variable. This was illustrated by the results of test Case 3.