1 Introduction

In the framework of stochastic optimal control theory [9, 23, 24], given a stochastic process X(t) subject to a control function u, a control problem is defined by introducing a general objective functional to be minimized that has the following structure

$$\begin{aligned} J(X,u)={\mathbb {E}}\left[ \int _{t_0}^T G(X(t),t,u(X(t), t))\, dt + F(X(T)) \right] , \end{aligned}$$
(1)

where \({\mathbb {E}}[ \cdot ]\) represents the expectation with respect to the probability measure induced by the process X(t) (nevertheless, for clarity we explicitly write X in the integral). On the other hand, a fundamental tool for analysing stochastic processes is the fact that the evolution of the probability density function (PDF) associated with X(t) is governed by the so-called Fokker–Planck (FP) equation (or forward Kolmogorov equation), which is a time-dependent partial differential equation (PDE) with an initial PDF configuration; see, e.g., [6] and references therein. Thus, assuming that the stochastic process X(t) admits an absolutely continuous probability measure, one can express the expectation in (1) explicitly in terms of the PDF governed by the FP problem.

Now, to illustrate these facts, to explain the purpose of this work, and to formulate the class of problems that we investigate, we introduce the following n-dimensional controlled Itō stochastic process

$$\begin{aligned} \left\{ \begin{array}{ll} d X(t) & = b(X(t), u(X(t),t)) dt + \sigma (X(t)) \, dW(t) , \qquad t\in (t_0,T]\\ X(t_0) & = X_0 , \end{array} \right. \end{aligned}$$
(2)

where the state variable \(X(t) \in \Omega \subseteq {\mathbb {R}}^n\) is subject to deterministic infinitesimal increments driven by the vector-valued drift function b, and to random increments proportional to a multi-dimensional Wiener process increment \(dW(t) \in {\mathbb {R}}^m\), with stochastically independent components, and \(\sigma \) is the dispersion matrix coefficient. In this stochastic differential equation (SDE), we assume that the state configuration at \(t_0\) is given by \(X_0\), and we suppose that the control function \(u \in {\mathcal {U}}\), where \({\mathcal {U}}\) represents the set of Markovian controls containing all jointly measurable functions u with \(u(x,t) \in K_U \subset {\mathbb {R}}^n\), and \(K_U\) is a compact set, which for simplicity is chosen as a subset of \({\mathbb {R}}^n\).

In applications, the model (2) is of central importance in statistical physics, e.g., in the study of Brownian processes, and in the study of biological systems [5]. Recently, it has attracted attention in the framework of modelling pedestrians’ motion; see, e.g., [33,34,35] and references therein. Notice that in all these cases, a constant dispersion coefficient is usually considered.
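For illustration only, sample paths of (2) with a constant scalar dispersion can be generated by the Euler–Maruyama method; in the following minimal sketch, the drift function and all parameter values are placeholder choices and not part of the analysis below.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, t0, T, n_steps, n_paths, rng=None):
    """Simulate paths of dX = b(X, t) dt + sigma dW with constant scalar dispersion."""
    rng = np.random.default_rng() if rng is None else rng
    dt = (T - t0) / n_steps
    X = np.tile(np.asarray(x0, dtype=float), (n_paths, 1))
    for k in range(n_steps):
        t = t0 + k * dt
        dW = rng.normal(0.0, np.sqrt(dt), size=X.shape)  # independent Wiener increments
        X = X + b(X, t) * dt + sigma * dW
    return X

# Placeholder drift steering every path towards the origin.
X_T = euler_maruyama(b=lambda X, t: -X, sigma=0.25, x0=[1.0, 1.0],
                     t0=0.0, T=1.0, n_steps=200, n_paths=1000)
```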

At this point, we remark that there is a difference in our understanding of the control function depending on whether its functional dependence on the state of the process X(t) is given a priori or is sought in order to solve the given control problem. In the former case, we have an open-loop control, and in the latter case we have a closed-loop control function, in the sense that a sudden change of the state of the process X(t) instantaneously provides (as feedback) the optimal control for the new state configuration. For a discussion on the significance of these two settings, we refer to [15, 16] and the discussion that follows.

Corresponding to (2) and a closed-loop control setting, we consider the following functional

$$\begin{aligned} C_{t_0,x_0}(u ) = {\mathbb {E}}\left[ \int _{t_0}^{T} G(X(s),s, u (X(s),s) ) ds + F(X(T)) \;\;|\;\; X(t_0) =x_{0}\right] , \end{aligned}$$
(3)

which is an expectation conditional on the process X(t) taking the value \(x_0\) at time \(t_0\). We refer to the functions G and F as the running cost and the terminal cost functions, respectively; see the above references for more details.

The optimal control \({\bar{u}}\) that minimizes \(C_{t_0,x_0}(u )\) for the process (2) is given by

$$\begin{aligned} {\bar{u}} = {\text {argmin}}_{u \in {\mathcal {U}} } C_{t_0,x_0}(u ). \end{aligned}$$
(4)

Correspondingly, one defines the following value function

$$\begin{aligned} q(x,t) := \min _{u \in {\mathcal {U}}} C_{t,x}(u)= C_{t,x}({\bar{u}} ) . \end{aligned}$$
(5)

A fundamental result in stochastic optimal control theory is that the function q (subject to appropriate conditions) is the solution to the so-called Hamilton–Jacobi–Bellman (HJB) equation given by

$$\begin{aligned} \left\{ \begin{array}{ll} \partial _t q + {\mathcal {H}}(x,t,D q, D^2 q) = 0,\\ q(x,T)= F(x), \end{array} \right. \end{aligned}$$
(6)

with the HJB Hamiltonian function

$$\begin{aligned} {\mathcal {H}}(x,t,D q,D^2 q)&:= \min _{v \in K_U} \Big [ G(x,t,v) + \sum _{i=1}^{n} b_i(x,v) \, \partial _{x_i} q(x,t) \nonumber \\&\quad + \sum _{i,j=1}^{n}a_{ij}(x)\, \partial ^2_{x_ix_j} q(x,t) \Big ] , \end{aligned}$$
(7)

where \(a_{ij}\) represents the ijth element of the matrix \(a=\sigma \, \sigma ^\top /2\). Notice that in our case, since the diffusion coefficient a does not depend on the control, the second-order differential term can be taken outside the minimization in (7).
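To illustrate the pointwise minimization defining (7), one can, for a scalar state, approximate the minimizer by direct enumeration over a finite set of candidate control values; in the following sketch, the running cost, the drift, the diffusion coefficient, and the derivative values of q are all placeholder inputs.

```python
import numpy as np

def hjb_hamiltonian_min(x, t, dq, d2q, G, b, a, candidates):
    """Approximate the minimum in (7) for a scalar state: enumerate
    G(x,t,v) + b(x,v)*dq + a*d2q over candidate control values v."""
    values = [G(x, t, v) + b(x, v) * dq + a * d2q for v in candidates]
    idx = int(np.argmin(values))
    return values[idx], candidates[idx]

# Placeholder data: quadratic tracking cost, drift equal to the control.
H_val, v_star = hjb_hamiltonian_min(
    x=0.5, t=0.0, dq=-0.3, d2q=0.8,
    G=lambda x, t, v: (x - 1.0) ** 2 + 0.1 * v ** 2,
    b=lambda x, v: v, a=0.5, candidates=np.linspace(-1.0, 1.0, 101))
```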

The HJB framework represents the essential tool to compute closed-loop controls. This framework poses the challenging task of analysing existence and uniqueness of solutions to the nonlinear HJB equation; see [20] for a fundamental work in this field. However, this task is facilitated in the case of uniform parabolicity, which, in the simplest case, is guaranteed by assuming that a is the identity matrix in \({\mathbb {R}}^n\) multiplied by a positive constant. This setting is considered later on to simplify our analysis.

We are now ready to introduce the FP equation for (2) and then formulate our optimal control problems while outlining the connection to the HJB framework mentioned above. We have

$$\begin{aligned}&\partial _t f (x,t) + \sum _{i=1}^{n}\partial _{x_i} \, ( b_{i}(x,u) \, f(x,t) ) - \sum _{i,j=1}^{n}\partial ^2_{x_ix_j}( a_{ij}(x) \, f(x,t) ) = 0 \end{aligned}$$
(8)
$$\begin{aligned}&f(x,t_0) = f_0(x) \end{aligned}$$
(9)

where f denotes the PDF of the stochastic process and \(f_0\) represents the PDF of the initial state \(X_0\) of the process; hence we require \(f_0(x) \ge 0\) with \(\int _\Omega f_0(x) \, dx=1\).

In the following, both the stochastic process (2) and the FP equation (8) are considered in the time interval [0, T], i.e. \(t_0=0\). We consider the stochastic process in the convex domain \(\Omega \subset {\mathbb {R}}^n\), with measure \(|\Omega |\), in the sense that \(X(t) \in \Omega \) and, consequently, \(\Omega \) is the domain where f is defined. We assume that the boundary \(\partial \Omega \) is Lipschitz, and denote with \(Q:=\Omega \times \left( 0,T\right) \) the space-time cylinder.

One can see that the coefficients of the FP equation are directly determined by the coefficients of the SDE. Moreover, the choice of the barriers that limit the value of X(t) in \(\Omega \) translates to boundary conditions for the PDF. Specifically, in the case of absorbing barriers, we have homogeneous Dirichlet boundary conditions for f on \(\partial \Omega \), \(t \in [0,T]\). On the other hand, reflecting barriers correspond to zero-flux boundary conditions. In fact, notice that (8) can be written in the form \(\partial _ t f = \nabla \cdot {\mathcal {F}}(f)\), where the ith component of the flux \({\mathcal {F}}\) is given by \({\mathcal {F}}_i (f)= \sum _{j=1}^{n}\partial _{x_j}( a_{ij}(x) \, f(x,t) ) - b_{i}(x,u) \, f(x,t) \); thus, reflecting barriers require \({\mathcal {F}}(f) \cdot \nu = 0\), where \(\nu \) represents the outward normal to \(\partial \Omega \). For reasons that are explained below, later on we focus on Dirichlet boundary conditions, and, in correspondence to this choice, we discuss existence and regularity of solutions to (8)–(9), and the properties of the control-to-state map \(u \mapsto f=f(u)\).

However, for both choices of boundary conditions, the following discussion, which aims at clarifying our framework, applies. Let us assume that \(f _0(x)=\delta (x-x_0)\) (the Dirac delta) at fixed \(t=t_0\), and notice that the expectation in (3) can be explicitly written in terms of the PDF solving the FP problem with this initial density. Thus, the functional (3) becomes

$$\begin{aligned} J(f (u ),u ):=\int _{t_0}^{T}\int _{\Omega } G(x,s,u (x,s) ) \, f (x,s) \, dx\, ds + \int _{\Omega } F(x) f (x,T)\, dx. \end{aligned}$$
(10)

Therefore, the optimization problem (4) can be equivalently stated as an FP optimal control problem in which a function u in the set \({\mathcal {U}}\) is sought that minimizes (10).
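For illustration, once the PDF corresponding to a given control is available on a grid, the reduced functional (10) can be evaluated by simple quadrature; the following sketch uses a rectangle rule in one space dimension with placeholder choices of G, F, and f.

```python
import numpy as np

def reduced_cost(f, u, x, t, G, F):
    """Evaluate (10): time-space integral of G*f plus the terminal term, by a
    rectangle rule; f and u have shape (len(x), len(t)) on uniform grids x, t."""
    dx, dt = x[1] - x[0], t[1] - t[0]
    running = sum(np.sum(G(x, t[k], u[:, k]) * f[:, k]) * dx for k in range(len(t)))
    terminal = np.sum(F(x) * f[:, -1]) * dx
    return running * dt + terminal

# Placeholder data: uniform stand-in PDF, zero control, quadratic tracking costs.
x, t = np.linspace(0.0, 1.0, 101), np.linspace(0.0, 1.0, 51)
f = np.ones((x.size, t.size))
u = np.zeros((x.size, t.size))
J = reduced_cost(f, u, x, t,
                 G=lambda x, s, u: (x - s) ** 2 + 0.1 * u ** 2,
                 F=lambda x: x ** 2)
```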

Now, in the Lagrange framework [38] and assuming Fréchet differentiability of all components of the FP optimal control problem, we can derive the following first-order necessary optimality conditions that characterize a solution to the FP optimal control problem. We have

$$\begin{aligned}&\partial _t f (x,t)+\sum _{i=1}^n\partial _{x_i} (b_i(x,u (x,t)) \, f (x,t)) \nonumber \\&\quad - \sum _{i,j=1}^n \partial ^2_{x_ix_j} ( a_{ij}(x)\, f (x,t)) = 0 ,\nonumber \\&\quad f (x,t_0) = f _0(x) , \end{aligned}$$
(11)
$$\begin{aligned}&\partial _t p(x,t)+ \sum _{i=1}^n b_i(x,u (x,t)) \, \partial _{x_i} p(x,t) \nonumber \\&\quad + \sum _{i,j=1}^n a_{ij}(x)\, \partial ^2_{x_ix_j} p(x,t) + G(x,t,u (x,t))= 0 ,\nonumber \\&\quad p(x,T) = F(x), \end{aligned}$$
(12)

and

$$\begin{aligned}&\int _{t_0}^{T}\int _{\Omega } \left( f(x,t) \left( \sum _{i=1}^n \partial _u b_i(x,u (x,t)) \, \partial _{x_i} p(x,t) + \partial _u G(x,t,u (x,t)) \right) \right) \nonumber \\&\quad \cdot \left( v (x,t)-u (x,t) \right) dx \, dt \ge 0 , \end{aligned}$$
(13)

for all \(v \in {\mathcal {U}}\). This optimality system is completed with the specification of the boundary conditions. We have homogeneous Dirichlet boundary conditions for p if such are the boundary conditions for f. In the case of zero-flux boundary conditions for f, variational calculus gives homogeneous Neumann boundary conditions for the adjoint variable.
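For a scalar control, the integrand of the variational inequality (13) can be assembled directly from the state, the adjoint gradient, and the derivatives of b and G with respect to u; the following sketch is only illustrative, with hypothetical array names and placeholder data.

```python
import numpy as np

def optimality_integrand(f, grad_p, du_b, du_G):
    """Pointwise integrand of (13) for a scalar control:
    f * ( sum_i d_u b_i * d_{x_i} p + d_u G ).
    grad_p and du_b are stacked along a leading axis i = 1..n; the remaining
    axes are the space-time grid."""
    return f * (np.sum(du_b * grad_p, axis=0) + du_G)

# Placeholder data on a 1D grid: b(x,u) = u (so d_u b = 1) and d_u G constant.
g = optimality_integrand(f=np.full((50, 20), 0.1),
                         grad_p=np.zeros((1, 50, 20)),
                         du_b=np.ones((1, 50, 20)),
                         du_G=np.full((50, 20), 0.2))
# (13) requires the integral of g*(v - u) over Q to be non-negative for all admissible v.
```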

We remark that, if a uniform parabolicity condition for the FP equation holds, then the solution of the FP problem, with \(f_0(x) \ge 0\) (strictly positive in some open set, and also in the case \(f_0(x)=\delta (x-x_0)\)) and with the given boundary and initial conditions, remains positive in the sense that \(f(x,t) >0 \) for \(t > t_0\) and almost everywhere in \(\Omega \). Based on this fact, we see that (13) represents the first-order optimality condition for the minimization problem for the control function in (7). This remark is the starting point to establish a formal connection between the HJB equation and the adjoint equation (12) at optimality [6, 7]. This connection has already been used within a Lagrangian framework to construct closed-loop controls for different application problems [33, 37].

However, a comparison of (7) with (13) shows that differentiability with respect to u is not required in the HJB problem, and the appropriate theoretical optimization framework to establish the HJB–FP optimal control connection appears to be provided by Pontryagin's maximum principle (PMP). This also implies that a consistent numerical optimization procedure to solve FP optimal control problems (and thus HJB problems) should be formulated in terms of the PMP framework.

In this paper, we would like to contribute to the investigation of both issues by pursuing a theoretical analysis of two specific FP optimal control problems in the PMP framework, and by developing a numerical PMP-based methodology. For both aims, we rely on our previous work in [13, 14] and on the fundamental references [27, 30], while our numerical PMP-based approach, already proposed in [13, 14], represents a further development of methods from the field of optimal control of ordinary differential equation models [25, 36].

Now, to explain the challenge of our work, we anticipate that the necessary optimality conditions provided by the PMP consist of the FP equation (11), the adjoint equation (12), and the condition

$$\begin{aligned} H\left( x,t,{\bar{f}}(x,t),{\bar{u}}(x,t),\nabla {\bar{p}}(x,t)\right) =\min _{v\in K_{U}}H\left( x,t,{\bar{f}}(x,t),v,\nabla {\bar{p}}(x,t)\right) , \end{aligned}$$
(14)

for almost all \(\left( x,t\right) \in Q\), where the PMP Hamiltonian function is given by

$$\begin{aligned} H\left( x,t,f,v,\zeta \right) :=\left( G\left( x,t,v\right) + b(x,v) \cdot \zeta \right) \, f . \end{aligned}$$

In (14), the pair \(\left( {\bar{f}},{\bar{u}}\right) \) denotes the solution of the FP optimal control problem, and \({\bar{p}}\) is the corresponding adjoint variable. We say that (11), (12), and (14) provide the PMP characterization of the solution to our FP optimal control problem. (We formulate (14) in terms of a minimum for convenience; however, an equivalent formulation in terms of maximization of H could be chosen.)
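In practice, the pointwise minimization (14) can be carried out by enumeration over a discretization of the compact set \(K_U\); the following sketch, with placeholder G and b, also makes visible that, since \(f>0\), the minimizer does not depend on f, which is the formal link to the HJB minimization in (7).

```python
import numpy as np

def pmp_pointwise_min(x, grad_p, G, b, candidates, f=1.0):
    """Minimize H = (G(v) + b(x,v).grad_p) * f over a finite set of candidate
    controls; for f > 0 the minimizer is independent of the value of f."""
    values = [(G(v) + np.dot(b(x, v), grad_p)) * f for v in candidates]
    return candidates[int(np.argmin(values))]

# Placeholder scalar data: b(x,v) = v as in (16), quadratic control cost, K_U = [-1, 1].
u_bar = pmp_pointwise_min(x=0.0, grad_p=0.8,
                          G=lambda v: 0.1 * v ** 2, b=lambda x, v: v,
                          candidates=np.linspace(-1.0, 1.0, 201))
```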

Notice that the terminal conditions of our FP adjoint problem and of the HJB problem above are identical, and that, whenever \(f(x,t) >0\), the minimizer \({\bar{u}}(x,t)\) of H at \((x,t)\) coincides with that of (7); correspondingly, we have

$$\begin{aligned} {\mathcal {H}}(x,t,D p,D^2 p) = G(x,t,{\bar{u}}) + \sum _{i=1}^{n} b_i(x,{\bar{u}}) \, \partial _{x_i} p(x,t) + \sum _{i,j=1}^{n}a_{ij}(x)\, \partial ^2_{x_ix_j} p(x,t) . \end{aligned}$$

Therefore, at optimality, the adjoint equation can be written as \(\partial _t p +{\mathcal {H}}(x,t,D p,D^2 p) =0\), which allows us to identify p with the value function q. Notice that, by the very notion of an absorbing boundary for the stochastic process, the value function, which can now be identified with p, must be zero at the boundary of the domain \(\Omega \).

However, although the HJB–FP connection is clear at a formal level, we have to guarantee that all components of the PMP optimality system are well defined. Specifically, on the one hand we have to discuss existence and regularity of solutions to (11) and (12); on the other hand, we need to guarantee the well posedness of (14) and to provide a methodology to implement it. These requirements are the main challenges that we face in this work.

We remark that, in order to achieve the above-mentioned goals, some \(L^\infty \) estimates on the FP solution f and on the adjoint variable p are required. The estimate on p is needed in the proof of the PMP characterization of an optimal control (see also [30]), whereas an additional \(L^\infty \) estimate for f is required when we analyse the well-posedness of our PMP-based optimization method.

Notice that these estimates are usually not considered in existing works focusing on a Lagrange framework for FP optimal control problems; see [6] for a list of these references. Indeed, these estimates are available in the case of PDEs with a linear control mechanism [13, 14], but not in the FP case, where the control (drift) multiplies f and is subject to differentiation. In fact, proving these estimates is a delicate issue that we address in the following sections, and for this purpose we make some assumptions that allow us to focus on the most relevant problems. In particular, we assume that the dispersion coefficient is a scalar multiple of the identity in \({\mathbb {R}}^n\), namely \(\sigma \, I\) with \(\sigma >0\) (we use the same symbol), so that \(a=\sigma \, \sigma ^\top /2=(\sigma ^2/2)\, I\), and we choose homogeneous Dirichlet boundary conditions for the PDF. The case of zero-flux boundary conditions requires a different analysis, especially concerning the already mentioned \(L^\infty \) estimates, and, in order to keep this paper at a reasonable size, it is not discussed here.

We focus, on the one hand, on a specific open-loop control structure and, on the other hand, on a closed-loop setting, having in mind that in applications an open-loop control is usually much easier to implement than a closed-loop one, while the latter provides, in principle, the optimal control ‘per antonomasia’. Our choice of a specific open-loop control structure is motivated by the discussion in [15, 16] in the framework of ensemble controls, where it is pointed out that a composite linear–bilinear open-loop control mechanism in the SDE may provide a reasonable approximation of a closed-loop control; we investigate this fact with numerical experiments. We also refer to [15, 16] and the recent work [8] concerning the choice of the functions G and F in the cost functional. These functions will be specified and discussed below.

The other challenge that we face in our work is the numerical solution of our FP optimal control problems within the PMP framework. This is a main focus of our research, and in previous works we have proposed the so-called sequential quadratic Hamiltonian (SQH) scheme to solve nonsmooth elliptic and parabolic optimal control problems with linear and bilinear control mechanisms [13, 14]. However, the bilinear control structure of the FP equation poses additional difficulties that we address in this work. Furthermore, the convergence analysis of the SQH scheme in [13, 14] needs to be extended to accommodate the FP structure, and this requires additional estimates that we present below.
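To convey the flavour of the approach, the following is a schematic sketch of one sweep of an SQH-type iteration in the spirit of [13, 14]: the control is updated by a pointwise minimization of the Hamiltonian augmented with a quadratic penalty, and accepted or rejected based on a sufficient-decrease test on the cost. The routines solve_fp, solve_adjoint, argmin_aug_H, the functional J, and all parameter values are hypothetical placeholders; the actual scheme and its convergence analysis are presented in Sect. 4.

```python
import numpy as np

def sqh_sweep(u, eps, J, solve_fp, solve_adjoint, argmin_aug_H,
              eta=1e-7, sigma_up=2.0, zeta_down=0.9):
    """One schematic SQH sweep: minimize the augmented Hamiltonian
    H + eps*|u - u_k|^2 pointwise, then accept or reject the candidate control."""
    f = solve_fp(u)                      # forward FP solve (placeholder)
    p = solve_adjoint(u)                 # adjoint FP solve (placeholder)
    u_new = argmin_aug_H(f, p, u, eps)   # pointwise minimization over K_U (placeholder)
    du2 = np.sum((u_new - u) ** 2)
    if J(u_new) - J(u) > -eta * du2:     # insufficient decrease of the cost: reject
        return u, eps * sigma_up         # keep the control, increase the penalty
    return u_new, eps * zeta_down        # accept the update, relax the penalty
```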

As already mentioned, our theoretical and numerical investigation focuses on the following two stochastic processes. In the first case, we take \(b(x,u)= \left( v+w \circ x\right) \), where the control \(u=b\) is identified with the pair \((v, w)\), \(v,w:[0,T]\rightarrow {\mathbb {R}}^{n}\); it represents our open-loop control function. This is a linear–affine control-drift for the SDE, in which x appears linearly and is modulated by w. In this case, our controlled stochastic process is modelled as follows

$$\begin{aligned} dX (t)=\left( v(t)+w(t) \circ X(t)\right) \, dt+\sigma \, dW(t), \end{aligned}$$
(15)

where \(\circ \) denotes the Hadamard product.

Our second SDE model is given by

$$\begin{aligned} dX (t)= u \left( X(t),t\right) \, dt+\sigma \, dW(t), \end{aligned}$$
(16)

where the control function \(u : \Omega \times [0,T]\rightarrow {\mathbb {R}}^{n}\) is intended to define a closed-loop control mechanism for the stochastic process. (In this case, the dependence of the control function on x has to be determined.)

Corresponding to these two cases, we consider FP optimal control problems that require minimizing (10) subject to the differential constraint given by (8)–(9), with \(t_0=0\) fixed.

In the next section, we investigate the FP equation and its adjoint concerning the required \(L^\infty \) estimates, considering both control strategies given above. In Sect. 3, we present the PMP characterization of our FP control problems. In Sect. 4, we illustrate the numerical approximation of the FP PMP optimality system and the PMP-based optimization procedure for solving the proposed optimal control problems. For the approximation, we use the Chang–Cooper scheme [34] combined with implicit first- and second-order Euler schemes; these schemes provide accurate and positivity-preserving approximations that are essential in the FP computation. Further, in this section we present the SQH method and discuss its convergence properties. We also discuss a modification of the SQH procedure that implements the PMP optimality condition in the case of feedback controls and resembles the HJB approach. In Sect. 5, we present results of numerical experiments that demonstrate the ability of the FP framework to determine the two different control strategies for stochastic processes. In particular, we present results of Monte Carlo simulations that show the ability of the control mechanisms to drive the ensemble of stochastic paths along a desired trajectory. A conclusion completes this work.

2 Analysis of the FP equation and of its adjoint

We start this section by providing the weak formulation of our FP problem (8)–(9) with homogeneous Dirichlet boundary conditions. We have

$$\begin{aligned} \begin{aligned}&\int _{\Omega }\left( f^{\prime}\left( x,t\right) \varphi \left( x\right) +\frac{\sigma ^{2}}{2}\nabla f\left( x,t\right) \cdot \nabla \varphi \left( x\right) +\nabla \cdot \left( b(x,t) \, f\left( x,t\right) \right) \varphi \left( x\right) \right) dx=0,\\&f\left( \cdot ,0\right) =f_{0}, \end{aligned} \end{aligned}$$
(17)

for all \(\varphi \in H_{0}^{1}\left( \Omega \right) \) and for almost all \(t\in \left( 0,T\right) \) where the dot \(\cdot \) denotes the Euclidean scalar product in \({\mathbb {R}}^n\), \(\nabla \) denotes the gradient in \({\mathbb {R}}^n\), the divergence of a vector-valued function \(y=\left( y^{1}, \ldots , y^{n}\right) ^T\) is given by \(\nabla \cdot y\), and the partial derivative with respect to t is denoted with \(f^{\prime}:=\frac{\partial }{\partial t}f\).

We remark that for any function \(y\in \left( L^{q}\left( 0,T\right) \right) ^{n}\), we have \(\Vert y\Vert _{L^{q}\left( 0,T\right) }^{q}:=\sum _{i=1}^{n}\Vert y^{i}\Vert _{L^{q}\left( 0,T\right) }^{q}\), and analogously for any function \(y\in \left( L^{\infty }\left( 0,T\right) \right) ^{n}\), we have \(\Vert y\Vert _{L^{\infty }\left( 0,T\right) }:=\max _{i=1,\ldots,n}\Vert y^{i}\Vert _{L^{\infty }\left( 0,T\right) }\). We assume that \(\Omega \subset {\mathbb {R}}^n\) is bounded and convex, and \(q>\frac{n}{2}+1\) for \(n\ge 2\) and \(q\ge 2\) for \(n=1\). Further, \(\left( \cdot ,\cdot \right) \) denotes the \(L^{2}\left( \Omega \right) \)-scalar product.
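Although the discretization adopted later in this work is the Chang–Cooper scheme of Sect. 4, a minimal finite-difference, implicit-Euler sketch for the one-dimensional FP problem with homogeneous Dirichlet boundary conditions may help fix ideas; the grid, the drift, and the initial PDF below are placeholder choices.

```python
import numpy as np

def solve_fp_1d(f0, b, sigma, x, T, n_steps):
    """Implicit Euler, centered differences for d_t f = (sigma^2/2) d_xx f - d_x(b f)
    on a uniform grid x with homogeneous Dirichlet boundary conditions."""
    n, dx, dt = len(x), x[1] - x[0], T / n_steps
    f = np.asarray(f0, dtype=float).copy()
    for m in range(n_steps):
        t = (m + 1) * dt
        bv = b(x, t)
        A = np.zeros((n, n))
        for i in range(1, n - 1):
            A[i, i - 1] += 0.5 * sigma**2 / dx**2 + bv[i - 1] / (2 * dx)
            A[i, i]     += -sigma**2 / dx**2
            A[i, i + 1] += 0.5 * sigma**2 / dx**2 - bv[i + 1] / (2 * dx)
        f = np.linalg.solve(np.eye(n) - dt * A, f)
        f[0] = f[-1] = 0.0               # homogeneous Dirichlet boundary conditions
    return f

# Placeholder data: Gaussian initial PDF and a drift pulling the mass towards x = 0.
grid = np.linspace(-2.0, 2.0, 201)
f0 = np.exp(-grid**2 / 0.1)
f0 /= np.sum(f0) * (grid[1] - grid[0])
fT = solve_fp_1d(f0, b=lambda x, t: -x, sigma=0.3, x=grid, T=1.0, n_steps=100)
```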

We first consider the case of a drift function \(b(x,t) = \left( v(t)+w(t) \circ x\right) \); thereafter, we focus on the case \(b(x,t) =u(x,t)\). We anticipate that, in our FP control problems, these functions are sought in the following sets of admissible controls. For the former case, we have

$$\begin{aligned} V_{ad}:=V_{ad}^{1}\times \cdots\times V_{ad}^{n} \; {\mathrm {\ and\ }}\; W_{ad}:=W_{ad}^{1}\times \cdots \times W_{ad}^{n} , \end{aligned}$$

where

$$\begin{aligned} V_{ad}^{i}:=\left\{ v\in L^{q}\left( 0,T\right) |\ v\left( t\right) \in K_{V}^{i}\ {\mathrm {a.e.\ in\ }}\left( 0,T\right) \right\} , \end{aligned}$$

and

$$\begin{aligned} W_{ad}^{i}:=\left\{ w\in L^{q}\left( 0,T\right) |\ w\left( t\right) \in K_{W}^{i}\ {\mathrm {a.e.\ in\ }}\left( 0,T\right) \right\} , \end{aligned}$$

where \(i\in \left\{ 1,\ldots,n\right\} \), and \(K_{V}^{i}\), \(K_{W}^{i}\) are compact subsets of \({\mathbb {R}}\). Correspondingly, we set

$$\begin{aligned} K_{V}:=K_{V}^{1}\times \cdots \times K_{V}^{n}{\mathrm {\ and\ }}K_{W}:=K_{W}^{1}\times \cdots \times K_{W}^{n} . \end{aligned}$$

For the latter case, we choose the following admissible set of controls

$$\begin{aligned} U_{ad}:=\left\{ u\in \left( L^{q}\left( Q\right) \right) ^{n}|\ u\left( x,t\right) \in K_{U}\ {\mathrm {a.e.\ on\ }}Q\right\} , \end{aligned}$$

where \(K_U\) represents a compact subset of \({\mathbb {R}}^{n}\). However, to ease our discussion and simplify notation, we consider the case \(K_{U}:=\left[ u_{\min },u_{\max }\right] ^{n}\), with \(u_{\min },u_{\max }\in {\mathbb {R}}\), \(u_{\min } < u_{\max }\).
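In computations, membership of a discretized control in \(U_{ad}\) is typically enforced by a pointwise projection onto the box \(K_U\); a minimal sketch (the names are illustrative):

```python
import numpy as np

def project_onto_K_U(u, u_min, u_max):
    """Pointwise projection of a discretized control onto K_U = [u_min, u_max]^n."""
    return np.clip(u, u_min, u_max)

u_admissible = project_onto_K_U(np.array([-2.0, 0.3, 1.7]), u_min=-1.0, u_max=1.0)
```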

2.1 The ‘open-loop’ case

We consider the case of the drift \(b(x,u)= \left( v+w \circ x\right) \). Since this drift is differentiable with respect to x, we have

$$\begin{aligned} \begin{aligned}&\nabla \cdot \left( \left( v\left( t\right) +x\circ w\left( t\right) \right) f\left( x,t\right) \right) =\sum _{i=1}^{n}\frac{\partial }{\partial x_{i}}\left( \left( v^{i}\left( t\right) +x_{i}w^{i}\left( t\right) \right) f\left( x,t\right) \right) \\&\quad =\sum _{i=1}^{n}\left[ \left( v^{i}\left( t\right) +x_{i}w^{i}\left( t\right) \right) \frac{\partial }{\partial x_{i}}f\left( x,t\right) +w^{i}\left( t\right) f\left( x,t\right) \right] \\&\quad =\left( v\left( t\right) +x\circ w\left( t\right) \right) \cdot \nabla f\left( x,t\right) +\sum _{i=1}^{n}w^{i}\left( t\right) f\left( x,t\right) . \end{aligned} \end{aligned}$$
(18)

The next theorem states a specific boundedness result that is required in our PMP framework.

Theorem 1

Consider the following parabolic problem

$$\begin{aligned} \begin{aligned}&\left( y^{\prime},\varphi \right) +a\left( \nabla y,\nabla \varphi \right) +\left( b\cdot \nabla y,\varphi \right) +\left( cy,\varphi \right) =\left( h,\varphi \right)&\qquad {\mathrm { in\ }}\Omega \times \left( 0,T\right) \\&y=0 \qquad {\mathrm { on\ }}\partial \Omega \times \left[ 0,T\right] \\&y=y_{0} \qquad {\mathrm { on\ }}\Omega \times \left\{ 0\right\} \end{aligned} , \end{aligned}$$
(19)

for all \(\varphi \in H_{0}^{1}\left( \Omega \right) \) and almost all \(t\in \left( 0,T\right) \); let \(a,T>0\), \(b\in \left( L^{\infty }\left( Q\right) \right) ^{n}\), \(c\in L^{\infty }\left( Q\right) \), \(y_{0}\in L^{\infty }\left( \Omega \right) \) and \(h\in L^{q}\left( Q\right) \). Then there exists a unique solution \(y\in L^{2}\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) \cap L^{\infty }\left( 0,T;L^{2}\left( \Omega \right) \right) \) to (19), and it satisfies the following

$$\begin{aligned} \Vert y\Vert _{L^{\infty }\left( Q\right) }\le C\left( \Vert h\Vert _{L^{q}\left( Q\right) }+\Vert y_{0}\Vert _{L^{\infty }\left( \Omega \right) }\right) , \end{aligned}$$
(20)

where \(C:=C\left( \Omega ,a,T,\Vert b\Vert _{L^{\infty }\left( Q\right) },\Vert c\Vert _{L^{\infty }\left( Q\right) }\right) >0\).

Proof

The existence and uniqueness of the solution \(y\in L^{2}\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) \cap L^{\infty }\left( 0,T;L^{2}\left( \Omega \right) \right) \) is proved in [21]. Our concern is to prove (20), for which purpose we use results in [14, Theorem A.1].

Now, we define the bilinear map

$$\begin{aligned} B\left( y,\varphi ;t\right)&:= \int _{\Omega }a\nabla y\left( x,t\right) \cdot \nabla \varphi \left( x\right) +b\left( x,t\right) \cdot \nabla y\left( x,t\right) \varphi \left( x\right) \\&\quad+c\left( x,t\right) y\left( x,t\right) \varphi \left( x\right) dx. \end{aligned}$$

In order to apply [14, Theorem A.1], we construct an auxiliary problem where the corresponding bilinear map fulfils the coercivity condition. For this purpose, we set \({\hat{y}}\left( x,t\right) :={\mathrm {e}}^{-\eta t}y\left( x,t\right) \) for any \(\eta \ge 0\) where y solves (19). Then, we multiply both sides of the equation in (19) with \({\mathrm {e}}^{-\eta t}\) and obtain

$$\begin{aligned} \int _{\Omega }{\mathrm {e}}^{-\eta t}y^{\prime}\left( x,t\right) \varphi \left( x\right) dx+{\mathrm {e}}^{-\eta t}B\left( y\left( \cdot ,t\right) ,\varphi ;t\right) =\int _{\Omega }{\mathrm {e}}^{-\eta t}h\left( x,t\right) \varphi \left( x\right) dx, \end{aligned}$$

where, inserting the definition of \({\hat{y}}\), noting that \({\hat{y}}^{\prime}=-\eta \,{\hat{y}}+{\mathrm {e}}^{-\eta t}y^{\prime}\), and using the linearity of B in its first argument, we obtain the following

$$\begin{aligned}&\int _{\Omega }{\hat{y}}^{\prime}\left( x,t\right) \varphi \left( x\right) dx+B\left( {\hat{y}}\left( \cdot ,t\right) ,\varphi ;t\right) +\int _{\Omega }\eta {\hat{y}}\left( x,t\right) \varphi \left( x\right) dx\nonumber \\&\quad =\int _{\Omega }{\hat{h}}\left( x,t\right) \varphi \left( x\right) dx, \end{aligned}$$
(21)

where \({\hat{h}}\left( x,t\right) :=h\left( x,t\right) {\mathrm {e}}^{-\eta t}\in L^{q}\left( Q\right) \), because of the boundedness of \(t\mapsto {\mathrm {e}}^{-\eta t}\) over \(\left[ 0,T\right] \). Now, from (21), we have that

$$\begin{aligned} \int _{\Omega }{\hat{y}}^{\prime}\left( x,t\right) \varphi \left( x\right) dx+{\hat{B}}\left( {\hat{y}}\left( \cdot ,t\right) ,\varphi \right) =\int _{\Omega }{\hat{h}}\left( x,t\right) \varphi \left( x\right) dx, \end{aligned}$$
(22)

which is uniquely solvable with \({\hat{y}}=0\ {\mathrm {on\ }}\partial \Omega \times \left[ 0,T\right] \) and \({\hat{y}}={\mathrm {e}}^{-\eta 0}y_{0}=y_{0}\ {\mathrm {on\ }}\Omega \times \left\{ 0\right\} \) where

$$\begin{aligned} {\hat{B}}\left( {\hat{y}},\varphi ;t\right) :=B\left( {\hat{y}}\left( \cdot ,t\right) ,\varphi ;t\right) +\int _{\Omega }\eta {\hat{y}}\left( x,t\right) \varphi \left( x\right) dx, \end{aligned}$$

see [21, Section 7.1 Theorem 3] with \({\hat{y}}\in L^{2}\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) \cap L^{\infty }\left( 0,T;L^{2}\left( \Omega \right) \right) \) since \(t\mapsto {\mathrm {e}}^{-\eta t}\) is bounded over \(\left[ 0,T\right] \). Then we have the following result

$$\begin{aligned} \begin{aligned}&a\Vert {\hat{y}}\left( \cdot ,t\right) \Vert _{H_{0}^{1}\left( \Omega \right) }^{2}\\&\quad =a\int _{\Omega }\nabla {\hat{y}}\left( x,t\right) \cdot \nabla {\hat{y}}\left( x,t\right) dx\\&\quad ={\hat{B}}\left( {\hat{y}},{\hat{y}};t\right) -\int _{\Omega }b\left( x,t\right) \cdot \nabla {\hat{y}}\left( x,t\right) {\hat{y}}\left( x,t\right) +\left( c\left( x,t\right) +\eta \right) {\hat{y}}^{2}\left( x,t\right) dx. \end{aligned} \end{aligned}$$
(23)

From (23), we obtain

$$\begin{aligned} \begin{aligned}&a\Vert {\hat{y}}\left( \cdot ,t\right) \Vert _{H_{0}^{1}\left( \Omega \right) }^{2}+\int _{\Omega }\eta {\hat{y}}^{2}\left( x,t\right) dx\\&\quad ={\hat{B}}\left( {\hat{y}},{\hat{y}};t\right) -\int _{\Omega }b\left( x,t\right) \cdot \nabla {\hat{y}}\left( x,t\right) {\hat{y}}\left( x,t\right) +c\left( x,t\right) {\hat{y}}^{2}\left( x,t\right) dx\\&\quad \le {\hat{B}}\left( {\hat{y}},{\hat{y}};t\right) +\Vert b\Vert _{L^{\infty }\left( Q\right) }\left( \epsilon \int _{\Omega }\nabla {\hat{y}}\left( x,t\right) \cdot \nabla {\hat{y}}\left( x,t\right) dx+\frac{1}{4\epsilon }\int _{\Omega }{\hat{y}}^{2}\left( x,t\right) dx\right) \\&\qquad +\Vert c\Vert _{L^{\infty }\left( Q\right) }\int _{\Omega }{\hat{y}}^{2}\left( x,t\right) dx, \end{aligned} \end{aligned}$$
(24)

with

$$\begin{aligned} \begin{aligned}&\left|\int _{\Omega }b\left( x,t\right) \cdot \nabla {\hat{y}}\left( x,t\right) {\hat{y}}\left( x,t\right) dx\right|\le \Vert b\Vert _{L^{\infty }\left( Q\right) }\sum _{i=1}^{n}\int _{\Omega }\left|\frac{\partial }{\partial x_{i}}{\hat{y}}\left( x,t\right) \right| \left|{\hat{y}}\left( x,t\right) \right|dx\\&\quad \le \Vert b\Vert _{L^{\infty }\left( Q\right) }\sum _{i=1}^{n}\left( \int _{\Omega }\epsilon \left|\frac{\partial }{\partial x_{i}}{\hat{y}}\left( x,t\right)\right|^{2}+\frac{1}{4\epsilon }\left|{\hat{y}}\left( x,t\right)\right|^{2}\right) dx\\&\quad =\Vert b\Vert _{L^{\infty }\left( Q\right) }\int _{\Omega }\epsilon \nabla {\hat{y}}\left( x,t\right) \cdot \nabla {\hat{y}}\left( x,t\right) +\frac{n}{4\epsilon }|{\hat{y}}\left( x,t\right) |^{2}dx, \end{aligned} \end{aligned}$$

where we use the Cauchy inequality, see [21, page 622], for \(\epsilon >0\).

We assume that \(\Vert b\Vert _{L^{\infty }\left( Q\right) }\ne 0\) and choose \(\epsilon :=\frac{a}{2\Vert b\Vert _{L^{\infty }\left( Q\right) }}\). From (24), we have that

$$\begin{aligned}&\frac{a}{2}\Vert {\hat{y}}\left( \cdot ,t\right) \Vert _{H_{0}^{1}\left( \Omega \right) }^{2} +\int _{\Omega }\eta {\hat{y}}^{2}\left( x,t\right) dx \le {\hat{B}}\left( {\hat{y}},{\hat{y}};t\right) \\&\quad +\frac{\Vert b\Vert _{L^{\infty }\left( Q\right) }^{2} +2a\Vert c\Vert _{L^{\infty }\left( Q\right) }}{2a}\int _{\Omega }{\hat{y}}^{2}\left( x,t\right) dx, \end{aligned}$$

which gives

$$\begin{aligned} a\Vert {\hat{y}}\left( \cdot ,t\right) \Vert _{H_{0}^{1}\left( \Omega \right) }^{2}\le {\hat{B}}\left( {\hat{y}},{\hat{y}};t\right) , \end{aligned}$$
(25)

for \(\eta \ge \frac{\Vert b\Vert _{L^{\infty }\left( Q\right) }^{2}+2a\Vert c\Vert _{L^{\infty }\left( Q\right) }}{2a}\). If \(\Vert b\Vert _{L^{\infty }\left( Q\right) }=0\), then from (24) we obtain that (25) holds for \(\eta \ge \Vert c\Vert _{L^{\infty }\left( Q\right) }\). Consequently, we choose

$$\begin{aligned} \eta \ge \frac{\Vert b\Vert _{L^{\infty }\left( Q\right) }^{2}+2a\Vert c\Vert _{L^{\infty }\left( Q\right) }}{2a}. \end{aligned}$$

Since it holds that \({\hat{B}}\left( -k,\varphi ;t\right) \le 0\) for \(k\ge 0\) if \(\varphi \ge 0\) for any \(t\in \left[ 0,T\right] \), we can apply [14, Theorem A.1] to the following

$$\begin{aligned} \begin{aligned} \left( {\hat{y}}^{\prime},\varphi \right) +{\hat{B}}\left( {\hat{y}},\varphi ;t\right)&=\left( {\hat{h}},\varphi \right) \ {\mathrm {in\ }}\Omega \times \left( 0,T\right) \\ {\hat{y}}&=0 \qquad {\mathrm {on\ }}\partial \Omega \times \left[ 0,T\right] \\ {\hat{y}}&=y_{0} \qquad {\mathrm {on\ }}\Omega \times \left\{ 0\right\} \end{aligned} , \end{aligned}$$

and obtain

$$\begin{aligned} \Vert {\hat{y}}\Vert _{L^{\infty }\left( Q\right) }\le {\hat{C}}\Vert {\hat{h}}\Vert _{L^{q}\left( Q\right) }+\Vert y_{0}\Vert _{L^{\infty }\left( \Omega \right) }, \end{aligned}$$
(26)

for a constant \({\hat{C}}>0\). Thus, from (26) we have

$$\begin{aligned} \begin{aligned}\Vert y\Vert _{L^{\infty }\left( Q\right) }&=\Vert {\mathrm {e}}^{\eta \cdot }{\hat{y}}\Vert _{L^{\infty }\left( Q\right) }\le {\mathrm {e}}^{\eta T}\Vert {\hat{y}}\Vert _{L^{\infty }\left( Q\right) }\le {\mathrm {e}}^{\eta T}{\hat{C}}\Vert {\hat{h}}\Vert _{L^{q}\left( Q\right) }+{\mathrm {e}}^{\eta T}\Vert y_{0}\Vert _{L^{\infty }\left( \Omega \right) }\\& \le {\hat{C}}{\mathrm {e}}^{\eta T}\Vert h\Vert _{L^{q}\left( Q\right) }+{\mathrm {e}}^{\eta T}\Vert y_{0}\Vert _{L^{\infty }\left( \Omega \right) }, \end{aligned} \end{aligned}$$

which gives (20) with \(C:=\max \left( {\hat{C}}{\mathrm {e}}^{\eta T},{\mathrm {e}}^{\eta T}\right) \). \(\square \)

In the framework of Theorem 1, and with \(v\in V_{ad}\) and \(w\in W_{ad}\), the FP equation (17) with \(f_{0}\in L^{2}\left( \Omega \right) \) is uniquely solvable for \(f\in L^{2}\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) \) with \(f^{\prime}\in L^{2}\left( 0,T;H^{-1}\left( \Omega \right) \right) \); see [1, Theorem 2.14], [21, Section 7.1, Theorem 3 and Theorem 4]. However, to obtain the desired regularity, we require \(f_{0}\in L^{\infty }\left( \Omega \right) \cap H_{0}^{1}\left( \Omega \right) \), so that \(f\in L^{2}\left( 0,T;H^{2}\left( \Omega \right) \right) \cap L^{\infty }\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) \); see [21, Section 7.1 Theorem 5], whose corresponding part of the proof also holds in our case. From these results and Theorem 1, we prove the following theorem; compare with [10].

Theorem 2

Let \(f_{0}\in L^{\infty }\left( \Omega \right) \cap H_{0}^{1}\left( \Omega \right) \). Then the solution to the FP problem (17) satisfies the following

$$\begin{aligned} \Vert f\Vert _{L^{\infty }\left( Q\right) }\le C \, \Vert f_{0}\Vert _{L^{\infty }\left( \Omega \right) }, \end{aligned}$$
(27)

where \(C:=C\left( \Omega ,\sigma ,T,\Vert v+x\circ w\Vert _{L^{\infty }\left( Q\right) },\Vert \sum _{i=1}^{n}w^{i}\Vert _{L^{\infty }\left( Q\right) }\right) >0\).

Proof

From the assumption \(f_{0}\in L^{\infty }\left( \Omega \right) \cap H_{0}^{1}\left( \Omega \right) \) and the discussion before Theorem 2, we have that \(f\in L^{2}\left( 0,T;H^{2}\left( \Omega \right) \right) \cap L^{\infty }\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) \). In view of (18), our FP equation takes the form of (19) with \(a=\sigma ^{2}/2\), \(b\left( x,t\right) =v\left( t\right) +x\circ w\left( t\right) \), \(c\left( x,t\right) =\sum _{i=1}^{n}w^{i}\left( t\right) \), and \(h=0\). Then we apply Theorem 1 and obtain the desired result. \(\square \)

Next, we discuss the adjoint FP problem given by

$$\begin{aligned} \begin{aligned}&\int _{\Omega }\left( -p^{\prime}\left( x,t\right) \varphi \left( x\right) +\frac{\sigma ^{2}}{2}\nabla p\left( x,t\right) \cdot \nabla \varphi \left( x\right) -\left( v\left( t\right) +x\circ w\left( t\right) \right) \cdot \nabla p\left( x,t\right) \varphi \left( x\right) \right) dx\\&\quad =\int _{\Omega }\left( G\left( x,t,v\left( t\right) ,w\left( t\right) \right) \varphi \left( x\right) \right) dx, \\&\qquad p\left( \cdot ,T\right) =F\left( \cdot \right) , \end{aligned} \end{aligned}$$
(28)

for all \(\varphi \in H_{0}^{1}\left( \Omega \right) \).

We assume that, for any \(v\in V_{ad}\) and \(w\in W_{ad}\), it holds that \(G\left( \cdot , \cdot , v,w\right) \in L^{\infty }\left( Q\right) \); further, we assume \(F\in L^{\infty }\left( \Omega \right) \cap H_{0}^{1}\left( \Omega \right) \), and consider the time transformation \(t=T-{\tilde{t}}\). Thus, we have the existence of a unique solution to (28) analogously to (17). Furthermore, by the proof of [21, Section 7.1 Theorem 5], we have that \(p\in L^{2}\left( 0,T;H^{2}\left( \Omega \right) \right) \cap L^{\infty }\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) \) and \(p^{\prime}\in L^{2}\left( 0,T;L^{2}\left( \Omega \right) \right) \). Therefore, we have the following theorem.

Theorem 3

For the solution to (28), it holds

$$\begin{aligned} \Vert p\Vert _{L^{\infty }\left( Q\right) }\le C\left( \Vert G\left( \cdot , \cdot , v,w\right) \Vert _{L^{q}\left( Q\right) }+\Vert F\Vert _{L^{\infty }\left( \Omega \right) }\right) , \end{aligned}$$

for \(C:=C\left( \Omega ,\sigma ,T,\max _{i=1,\ldots,n}\Vert v^{i}+x_{i}w^{i}\Vert _{L^{\infty }\left( Q\right) }\right) >0\).

Proof

As \(F\in L^{\infty }\left( \Omega \right) \cap H_{0}^{1}\left( \Omega \right) \), and due to the pointwise boundedness of v, w and \(\left( v^{i}\left( t\right) +x_{i}w^{i}\left( t\right) \right) \in L^{\infty }\left( Q\right) \) for all \(i\in \left\{ 1,\ldots,n\right\} \), we can apply Theorem 1 to obtain the desired result. \(\square \)

Additionally, we have that \(p\in L^{q}\left( 0,T;W_{0}^{1,q}\left( \Omega \right) \right) \); see [11] and [32, Proposition 8.35]. Notice that in the adjoint FP problem (28) the solution of the forward FP problem does not appear. This is due to the linearity of our cost functional with respect to the PDF.

2.2 The ‘closed-loop’ case

In the case where \(u\in \left( L^{q}\left( Q\right) \right) ^{n}\), the well-posedness of the FP problem with homogeneous Dirichlet boundary conditions is discussed in [5, 22], and also in this case, an \(L^{\infty }\) bound for the PDF, analogous to Theorem 2, can be shown based on [10, Theorem 3.1]. As discussed in detail below, the adjoint FP problem for this case is given by

$$\begin{aligned} \begin{aligned}&\int _{\Omega }\left( -p^{\prime}\left( x,t\right) \varphi \left( x\right) +\frac{\sigma ^{2}}{2}\nabla p\left( x,t\right) \cdot \nabla \varphi \left( x\right) -u\left( x,t\right) \cdot \nabla p\left( x,t\right) \varphi \left( x\right) \right) dx\\&\quad =\int _{\Omega }\left( G\left( x,t,u\left( x,t\right) \right) \varphi \left( x\right) \right) dx, \\&\quad p\left( \cdot ,T\right) =F\left( \cdot \right) , \end{aligned} \end{aligned}$$
(29)

for all \(\varphi \in H_{0}^{1}\left( \Omega \right) \). Further, since \(u\in U_{ad} \subset \left( L^{\infty }\left( Q\right) \right) ^{n}\), we can apply Theorem 1 to obtain an \(L^{\infty }\) bound for the solution of the adjoint problem that is analogous to that of Theorem 3. For brevity, and to keep this paper at a reasonable size, we refrain from stating these results in the form of theorems. Notice that, also in this case, the PDF does not appear in the adjoint FP problem.

3 Analysis of FP optimal control problems

In this section, we discuss our optimal control problems governed by the FP model (8)–(9) with homogeneous Dirichlet boundary conditions and with the cost functional defined in (10). Also in this section, we first discuss the open-loop case where \(b\left( x,t\right) =v\left( t\right) +x\circ w\left( t\right) \) and identify u with the pair \((v, w)\). Thereafter, we focus on the closed-loop case.

In the first case, we consider the following optimal control problem

$$\begin{aligned} \begin{aligned}&\min _{f,v,w}J\left( f,v,w\right) :=\int _{0}^{T}\int _{\Omega }G\left( x,t,v\left( t\right) ,w\left( t\right) \right) f\left( x,t\right) dxdt+\int _{\Omega }F\left( x\right) f\left( x,T\right) dx\\&{\mathrm {s.t.\ }}\int _{\Omega }\left( f^{\prime}\left( x,t\right) \varphi \left( x\right) +\frac{\sigma ^{2}}{2}\nabla f\left( x,t\right) \cdot \nabla \varphi \left( x\right) \right. \\&\quad \left. +\nabla \cdot \left( \left( v\left( t\right) +x\circ w\left( t\right) \right) \, f\left( x,t\right) \right) \varphi \left( x\right) \right) dx=0\\&\ \quad \ \ {\mathrm {a.e.\ in\ }}\left( 0,T\right) \ {\mathrm {for\ all\ }}\varphi \in H_{0}^{1}\left( \Omega \right) \\&f\left( \cdot ,0\right) =f_{0}\\&v\in V_{ad},\ w\in W_{ad} . \end{aligned} \end{aligned}$$
(30)

Suppose that the purpose of the control is that all realizations of the stochastic process (2) track a desired \(x_d \in L^2(0,T;{\mathbb {R}}^n)\), and attain a given final configuration \(x_T \in {\mathbb {R}}^n\) at final time (possibly with \(x_d(T) \ne x_T\)), while the cost of the control is kept at a minimum. (The ability to track and attain given values as well as the cost are meant in terms of statistical mean.)

Then, we can think of G having a composite structure

$$\begin{aligned} G\left( x,t,v\left( t\right) ,w\left( t\right) \right) :=A\left( x,t\right) +\alpha \, g_{s_{1}}\left( v\left( t\right) \right) +\beta \, g_{s_{2}}\left( w\left( t\right) \right) , \end{aligned}$$
(31)

where A plays the role of an attracting potential (i.e. a well centred at a desired minimum point, such that the negative of the gradient of the potential is directed towards this minimum). For example, as in [15, 16], one could choose \(A\big (x,t\big )=\big (x - x_d(t)\big )^2\), and similarly \(F\big (x\big )=\big (x-x_T\big )^2\). Moreover, as in [8], we may choose \(A\big (x,t\big )=-\exp \big ( -\big (x - x_d(t)\big )^2 / c\big )\). In both cases, the minimum of the tracking part of the functional corresponds to having the PDF concentrated on the desired path. In general, we assume that \(A\in L^{q}\left( 0,T;W^{1,q}\left( \Omega \right) \right) \cap L^{\infty }\left( Q\right) \) is bounded from below.

On the other hand, we consider costs of the controls in the form

$$\begin{aligned} g_s:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}},\ z\mapsto g_{s}\left( z\right) :=\sum _{i=1}^{n}g_{s}^{i}\left( z^{i}\right) , \qquad s \ge 0 , \end{aligned}$$

where the \(g_{s}^{i}\) are non-negative functions, and in (31) we have \(s_{1},s_{2}\ge 0\) and \(\alpha ,\beta \ge 0\). The lower boundedness of the cost functional is ensured by the lower boundedness of A, the non-negativity of \(g_{s_{1}}\), \(g_{s_{2}}\), and the fact that \(f \ge 0\). Clearly, the choice \(g^i_{s}(z^i)=(z^i)^2\) corresponds in the functional to a mean \(L^2\) cost. On the other hand, the choice \(g^i_{s}(z^i)=|z^i |\) corresponds to a mean \(L^1\) cost. In general, we assume that the \(g_s^i\) are convex and Lipschitz continuous.
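As an illustration only, the following sketch evaluates the composite running cost (31) at a single point for the two attracting potentials mentioned above and for an \(L^2\)- or \(L^1\)-type control cost; the weights, the constant c, and the desired trajectory are placeholder choices.

```python
import numpy as np

def running_cost(x, t, v, w, x_d, alpha=1.0, beta=1.0, c=0.5,
                 potential="quadratic", control_cost="l2"):
    """Composite running cost (31): attracting potential A plus control costs."""
    d2 = np.sum((x - x_d(t)) ** 2)                  # squared distance to the desired path
    A = d2 if potential == "quadratic" else -np.exp(-d2 / c)
    g = (lambda z: np.sum(z ** 2)) if control_cost == "l2" else (lambda z: np.sum(np.abs(z)))
    return A + alpha * g(v) + beta * g(w)

# Placeholder desired trajectory: a straight line in each component.
value = running_cost(x=np.array([0.4, 0.6]), t=0.5,
                     v=np.array([0.1, 0.0]), w=np.array([-0.2, 0.1]),
                     x_d=lambda t: np.array([t, t]))
```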

With this setting, existence of a solution to (30) can be proved as in [22, Section 5, Existence of optimal controls] with the following additional arguments. The function G is bounded by a constant, and the control-to-state map is sequentially continuous as a map from the admissible set of controls to \(L^2\left( Q\right) \); see [22, Remark 5.1]. This means that any weakly converging sequence of controls results in a strongly converging subsequence of the corresponding states, which ensures that we can proceed similarly to the proof of [22, Corollary 5.1] in our case.

Our purpose is to discuss in detail the characterization of solutions to (30) in the PMP framework. For this reason, we present some preparatory results that are required for proving Theorem 5 below.

Central to the formulation of the PMP is the so-called Hamiltonian function \(H:{\mathbb {R}}^{n}\times {\mathbb {R}}\times {\mathbb {R}}\times K_{V}\times K_{W}\times {\mathbb {R}}^{n}\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} H\left( x,t,f,v,w,\zeta \right) :=G\left( x,t,v,w\right) f+\zeta \cdot \left( v+x\circ w\right) \, f. \end{aligned}$$

Notice that, when f, v, w, \(\zeta \) are functions, we shorten notation and write

$$\begin{aligned} H\left( x,t,f,v,w,\zeta \right) :=H\left( x,t,f\left( x,t\right) ,v\left( t\right) ,w\left( t\right) ,\zeta \left( x,t\right) \right) . \end{aligned}$$

Later, the placeholder \(\zeta \) will be replaced by the spatial gradient of the solution to the adjoint FP problem.

The next step in order to characterise a solution to (30) in the PMP framework is the following lemma, which provides a direct relationship between the values of the cost functional at different triples \((f, v, w)\) and the values of the corresponding Hamiltonian. We say that \(\left( f_{1},v_{1},w_{1}\right) \) solves (17) if \(f_{1}\) solves (17) with \(v_{1}\) in place of v and \(w_{1}\) in place of w. (We also write \(f\leftarrow f_{1}\), \(v\leftarrow v_{1}\) and \(w\leftarrow w_{1}\).) We have

Lemma 4

Let \(\left( f_{1},v_{1},w_{1}\right) \) and \(\left( f_{2},v_{2},w_{2}\right) \) each be solutions to (17). Then, it holds

$$\begin{aligned} J\left( f_{1},v_{1},w_{1}\right) -J\left( f_{2},v_{2},w_{2}\right)&= \int _{0}^{T}\int _{\Omega }H\left( x,t,f_{2},v_{1},w_{1},\nabla p_{1}\right) \\&\quad -H\left( x,t,f_{2},v_{2},w_{2},\nabla p_{1}\right) dxdt, \end{aligned}$$

where \(p_{1}\) is given by (28) for \(v\leftarrow v_{1}\) and \(w\leftarrow w_{1}\).

Proof

In order to save notational effort, we drop the functions’ dependence on x, t. We have

$$\begin{aligned} \begin{aligned}&J\left( f_{1},v_{1},w_{1}\right) -J\left( f_{2},v_{2},w_{2}\right) \\&\quad =\int _{0}^{T}\int _{\Omega }G\left( v_{1},w_{1}\right) f_{1}dxdt+\int _{\Omega }Ff_{1}\left( \cdot ,T\right) dx\\&\qquad -\int _{0}^{T}\int _{\Omega }G\left( v_{2},w_{2}\right) f_{2}dxdt-\int _{\Omega }Ff_{2}\left( \cdot ,T\right) dx\\&\quad =\int _{0}^{T}\int _{\Omega }G\left( v_{1},w_{1}\right) f_{2}+G\left( v_{1},w_{1}\right) \left( f_{1}-f_{2}\right) \\&\qquad -\,G\left( v_{2},w_{2}\right) f_{2}dxdt+\int _{\Omega }F\left( f_{1}-f_{2}\right) \left( \cdot ,T\right) dx, \end{aligned} \end{aligned}$$
(32)

and

$$\begin{aligned} \begin{aligned}&\int _{0}^{T}\int _{\Omega }G\left( v_{1},w_{1}\right) \left( f_{1}-f_{2}\right) \\&\quad =\int _{0}^{T}\int _{\Omega }-p_{1}^{\prime}\left( f_{1}-f_{2}\right) +\frac{\sigma ^{2}}{2}\nabla p_{1}\cdot \nabla \left( f_{1}-f_{2}\right) \\&\qquad -\,\left( v_{1}+x\circ w_{1}\right) \cdot \nabla p_{1}\left( f_{1}-f_{2}\right) dx\\&\quad =\int _{0}^{T}\int _{\Omega }\left( f_{1}^{\prime}-f_{2}^{\prime}\right) p_{1}+\frac{\sigma ^{2}}{2}\left( \nabla f_{1}-\nabla f_{2}\right) \cdot \nabla p_{1}\\&\qquad +\,\left( \nabla \left( \left( v_{1}+x\circ w_{1}\right) f_{1}\right) -\nabla \left( \left( v_{1}+x\circ w_{1}\right) f_{2}\right) \right) p_{1}dxdt\\&\qquad -\int _{\Omega }F\left( f_{1}-f_{2}\right) \left( \cdot ,T\right) dx\\&\quad =\int _{0}^{T}\int _{\Omega }\nabla \left( \left( v_{2}+x\circ w_{2}\right) f_{2}\right) p_{1}\\&\qquad -\nabla \left( \left( v_{1}+x\circ w_{1}\right) f_{2}\right) p_{1}dxdt-\int _{\Omega }F\left( f_{1}-f_{2}\right) \left( \cdot ,T\right) dx, \end{aligned} \end{aligned}$$
(33)

by partial integration with respect to t [38, Satz 3.11], partial integration with respect to x (second line) and with (17) (fourth line). Combining (32) and (33), we obtain

$$\begin{aligned} \begin{aligned}&J\left( f_{1},v_{1},w_{1}\right) -J\left( f_{2},v_{2},w_{2}\right) \\&\quad =\int _{0}^{T}\int _{\Omega }G\left( v_{1},w_{1}\right) f_{2}-\nabla \left( \left( v_{1}+x\circ w_{1}\right) f_{2}\right) p_{1}\\&\qquad -\,G\left( v_{2},w_{2}\right) f_{2}+\nabla \left( \left( v_{2}+x\circ w_{2}\right) f_{2}\right) p_{1}dxdt\\&\quad =\int _{0}^{T}\int _{\Omega }G\left( v_{1},w_{1}\right) f_{2}+\left( \left( v_{1}+x\circ w_{1}\right) f_{2}\right) \cdot \nabla p_{1}\\&\qquad -\,G\left( v_{2},w_{2}\right) f_{2}-\left( \left( v_{2}+x\circ w_{2}\right) f_{2}\right) \cdot \nabla p_{1}dxdt\\&\quad =\int _{0}^{T}\int _{\Omega }H\left( x,t,f_{2},v_{1},w_{1},\nabla p_{1}\right) -H\left( x,t,f_{2},v_{2},w_{2},\nabla p_{1}\right) dxdt. \end{aligned} \end{aligned}$$

\(\square \)

Next, we recall that the standard step for the PMP characterisation of a solution to the FP optimal control problem (30) is to introduce the concept of needle variation. For this purpose, given any \({\tilde{v}}\in V_{ad}\) and \({\tilde{w}}\in W_{ad}\), a point \(t_{0}\in \left( 0,T\right) \), and balls \(S_{k}\left( t_{0}\right) \) centred at \(t_{0}\) whose measures satisfy \(\lim _{k\rightarrow \infty }|S_{k}\left( t_{0}\right) |=0\), we define the needle variations as follows

$$\begin{aligned} v_{k}\left( t\right) :={\left\{ \begin{array}{ll} {\tilde{v}}\left( t\right) & {\mathrm {\ if\ }}t\in \left( 0,T\right) \backslash S_{k}\left( t_{0}\right) \\ v & {\mathrm {\ if\ }}t\in S_{k}\left( t_{0}\right) \cap \left( 0,T\right) \end{array}\right. },\quad w_{k}\left( t\right) :={\left\{ \begin{array}{ll} {\tilde{w}}\left( t\right) & {\mathrm {\ if\ }}t\in \left( 0,T\right) \backslash S_{k}\left( t_{0}\right) \\ w & {\mathrm {\ if\ }}t\in S_{k}\left( t_{0}\right) \cap \left( 0,T\right) \end{array}\right. }, \end{aligned}$$

where \(v\in K_{V}\) and \(w\in K_{W}\). These variations are understood componentwise in v and w.
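For a scalar control component, a needle variation can be realized as a simple function wrapper in which the radius of the interval plays the role of the shrinking ball \(S_k(t_0)\); the following sketch is purely illustrative.

```python
def needle_variation(u_tilde, v, t0, radius):
    """Needle variation: equal to u_tilde outside S_k(t0) = (t0 - radius, t0 + radius)
    and to the constant value v inside it."""
    return lambda t: v if abs(t - t0) < radius else u_tilde(t)

# Placeholder: perturb the zero control around t0 = 0.5 on an interval of radius 0.1.
v_k = needle_variation(u_tilde=lambda t: 0.0, v=1.0, t0=0.5, radius=0.1)
```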

Now, we can state the PMP characterisation of an optimal control to (30) as follows.

Theorem 5

Let \(\left( {\bar{f}},{\bar{v}},{\bar{w}}\right) \) be a solution to (30). Then it holds that

$$\begin{aligned} \int _{\Omega }H\left( x,t,{\bar{f}},{\bar{v}},{\bar{w}},\nabla {\bar{p}}\right) dx=\min _{v\in K_{V},w\in K_{W}}\int _{\Omega }H\left( x,t,{\bar{f}},v,w,\nabla {\bar{p}}\right) dx, \end{aligned}$$
(34)

for almost all \(t\in \left( 0,T\right) \), where \({\bar{p}}\) is the solution to (28) for \(v\leftarrow {\bar{v}}\) and \(w\leftarrow {\bar{w}}\).

Proof

Since \(v_{k}\in V_{ad}\) and \(w_{k}\in W_{ad}\) for all \(k\in {\mathbb {N}}\), we have with Lemma 4 that for any \(k\in {\mathbb {N}}\) the following holds

$$\begin{aligned} \begin{aligned}&0\le \frac{1}{|S_{k}\left( t_{0}\right) |}\left( J\left( f_{k},v_{k},w_{k}\right) -J\left( {\bar{f}},{\bar{v}},{\bar{w}}\right) \right) \\&\quad =\frac{1}{|S_{k}\left( t_{0}\right) |}\left( \int _{0}^{T}\int _{\Omega }H\left( x,t,{\bar{f}},v_{k},w_{k},\nabla p_{k}\right) -H\left( x,t,{\bar{f}},{\bar{v}},{\bar{w}},\nabla p_{k}\right) dxdt\right) \\&\quad =\frac{1}{|S_{k}\left( t_{0}\right) |}\left( \int _{S_{k}\left( t_{0}\right) }\int _{\Omega }H\left( x,t,{\bar{f}},v,w,\nabla {\bar{p}}\right) -H\left( x,t,{\bar{f}},{\bar{v}},{\bar{w}},\nabla {\bar{p}}\right) dxdt\right) \\&\qquad +\frac{1}{|S_{k}\left( t_{0}\right) |}\left( \int _{S_{k}\left( t_{0}\right) }\int _{\Omega }\left( \nabla p_{k}-\nabla {\bar{p}}\right) \cdot \left( v+x\circ w\right) {\bar{f}}\right. \\&\qquad \left. +\left( \nabla {\bar{p}}-\nabla p_{k}\right) \cdot \left( {\bar{v}}+x\circ {\bar{w}}\right) {\bar{f}}dxdt\right) \\&\quad =\frac{1}{|S_{k}\left( t_{0}\right) |}\left( \int _{S_{k}\left( t_{0}\right) }\int _{\Omega }H\left( x,t,{\bar{f}},v,w,\nabla {\bar{p}}\right) -H\left( x,t,{\bar{f}},{\bar{v}},{\bar{w}},\nabla {\bar{p}}\right) dxdt\right) \\&\qquad -\frac{1}{|S_{k}\left( t_{0}\right) |}\left( \int _{S_{k}\left( t_{0}\right) }\int _{\Omega }\left( p_{k}-{\bar{p}}\right) \nabla \left( \left( v+x\circ w\right) {\bar{f}}\right) \right. \\&\left. \qquad +\left( {\bar{p}}-p_{k}\right) \nabla \left( \left( {\bar{v}}+x\circ {\bar{w}}\right) {\bar{f}}\right) dxdt\right) , \end{aligned} \end{aligned}$$
(35)

for all \(v\in K_{V}\) and \(w\in K_{W}\) where \(\left( f_{k},v_{k},w_{k}\right) \) solves (17) for \(\left( f,v,w\right) \leftarrow \left( f_{k},v_{k},w_{k}\right) \).

Next, we prove that

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert p_{k}-{\bar{p}}\Vert _{L^{\infty }\left( Q\right) }=0. \end{aligned}$$

We subtract (28) for \(v\leftarrow v_{k}\) and \(w\leftarrow w_{k}\) from (28) for \(v\leftarrow {\bar{v}}\) and \(w\leftarrow {\bar{w}}\) and obtain

$$\begin{aligned} \begin{aligned}&\int _{\Omega }-\delta p^{\prime}\left( x,t\right) \varphi \left( x\right) +\frac{\sigma ^{2}}{2}\nabla \delta p\left( x\right) \cdot \nabla \varphi \left( x\right) \\&\quad -\,\left( {\bar{v}}\left( t\right) +x\circ {\bar{w}}\left( t\right) \right) \cdot \nabla {\bar{p}}\left( x,t\right) \varphi \left( x\right) dx\\&\quad +\,\int _{\Omega }\left( v_{k}\left( t\right) +x\circ w_{k}\left( t\right) \right) \cdot \nabla p^{k}\left( x,t\right) \varphi \left( x\right) dx=\int _{\Omega }\left( G\left( {\bar{v}},{\bar{w}}\right) \left( x,t\right) \right. \\&\left. \quad -\,G\left( v_{k},w_{k}\right) \left( x,t\right) \right) \varphi \left( x\right) dx, \end{aligned} \end{aligned}$$

where \(\delta p:={\bar{p}}-p^{k}\) and thus

$$\begin{aligned} \begin{aligned}&\int _{\Omega }-\delta p^{\prime}\left( x,t\right) \varphi \left( x\right) +\frac{\sigma ^{2}}{2}\nabla \delta p\left( x\right) \cdot \nabla \varphi \left( x\right) -\left( v_{k}\left( t\right) +x\circ w_{k}\left( t\right) \right) \cdot \nabla \delta p\left( x,t\right) \varphi \left( x\right) dx\\&\quad =\int _{\Omega }\left( v_{k}\left( t\right) +x\circ w_{k}\left( t\right) -\left( {\bar{v}}\left( t\right) +x\circ {\bar{w}}\left( t\right) \right) \right) \cdot \nabla {\bar{p}}\left( x,t\right) \varphi \left( x\right) dx\\&\qquad +\int _{\Omega }\left( G\left( {\bar{v}},{\bar{w}}\right) \left( x,t\right) -G\left( v_{k},w_{k}\right) \left( x,t\right) \right) \varphi \left( x\right) dx. \end{aligned} \end{aligned}$$
(36)

From (36) and Theorem 3, we have that

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert p_{k}-{\bar{p}}\Vert _{L^{\infty }\left( Q\right) }=0, \end{aligned}$$

if

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert \left( v_{k}+\left( \cdot \right) \circ w_{k}-\left( {\bar{v}}+\left( \cdot \right) \circ {\bar{w}}\right) \right) \cdot \nabla {\bar{p}}\Vert _{L^{q}\left( Q\right) }=0, \end{aligned}$$

and

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert G\left( {\bar{v}},{\bar{w}}\right) -G\left( v_{k},w_{k}\right) \Vert _{L^{q}\left( Q\right) }=0. \end{aligned}$$

For the first term, we have the following

$$\begin{aligned} \begin{aligned}&\int _{Q}|\left( \left( v_{k}\left( t\right) +x\circ w_{k}\left( t\right) \right) -\left( {\bar{v}}\left( t\right) +x\circ {\bar{w}}\left( t\right) \right) \right) \cdot \nabla {\bar{p}}\left( x,t\right) |^{q}dxdt\\&\quad \le c\int _{0}^{T}\int _{\Omega }\sum _{i=1}^{n}|\frac{\partial }{\partial x_{i}}{\bar{p}}\left( x,t\right) |^{q}dxdt\\&\quad \le c\int _{0}^{T}\Vert \nabla {\bar{p}}\left( \cdot ,t\right) \Vert _{L^{q}\left( \Omega \right) }^{q}dt\le c\Vert {\bar{p}}\Vert _{L^{q}\left( 0,T;W^{1,q}\left( \Omega \right) \right) }^{q}, \end{aligned} \end{aligned}$$

for a constant \(c>0\) due to \(p\in L^{q}\left( 0,T;W^{1,q}\left( \Omega \right) \right) \), \(q>\frac{n}{2}+1\), see [11]. This means that the function

$$\begin{aligned} t\mapsto \int _{\Omega }|\left( v_{k}\left( t\right) +x\circ w_{k}\left( t\right) -\left( {\bar{v}}\left( t\right) +x\circ {\bar{w}}\left( t\right) \right) \right) \cdot \nabla {\bar{p}}\left( x,t\right) |^{q}dx, \end{aligned}$$

is integrable, see [4, Theorem 6.11, Theorem 6.9] and we can apply the Average Value Theorem [29, Theorem 51] to obtain

$$\begin{aligned} \begin{aligned}&\lim _{k\rightarrow \infty }\Vert \left( v_{k}+\left( \cdot \right) \circ w_{k}-\left( {\bar{v}}+\left( \cdot \right) \circ {\bar{w}}\right) \right) \cdot \nabla {\bar{p}}\Vert _{L^{q}\left( Q\right) }\\&\quad =\lim _{k\rightarrow \infty }\int _{S_{k}\left( t_{0}\right) }\int _{\Omega }|\left( v\left( t\right) +x\circ w\left( t\right) -\left( {\bar{v}}\left( t\right) +x\circ {\bar{w}}\left( t\right) \right) \right) \cdot \nabla {\bar{p}}\left( x,t\right) |^{q}dxdt=0, \end{aligned} \end{aligned}$$

for almost all \(t_{0}\in \left( 0,T\right) \). Further, since

$$\begin{aligned} \Vert G\left( {\bar{v}},{\bar{w}}\right) -G\left( v_{k},w_{k}\right) \Vert _{L^{\infty }\left( Q\right) }<c, \end{aligned}$$

for all \({\bar{v}}\in V_{ad}\) and \({\bar{w}}\in W_{ad}\), we analogously have that

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert G\left( {\bar{v}},{\bar{w}}\right) -G\left( v_{k},w_{k}\right) \Vert _{L^{q}\left( Q\right) }=\lim _{k\rightarrow \infty }\int _{S_{k}\left( t_{0}\right) }\int _{\Omega }|G\left( {\bar{v}},{\bar{w}}\right) -G\left( v,w\right) |^{q}dxdt=0, \end{aligned}$$

for almost all \(t_{0}\in \left( 0,T\right) \).

Now, we have that the last line in (35) goes to zero for \(k\rightarrow \infty \) due to

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert p_{k}-{\bar{p}}\Vert _{L^{\infty }\left( Q\right) }=0, \end{aligned}$$

and due to the Average Value Theorem [29, Theorem 51], because

$$\begin{aligned} \begin{aligned}&\int _{Q}\left|\nabla \left( \left( v\left( t\right) +x\circ w\left( t\right) \right) {\bar{f}}\left( x,t\right) \right) \right|dx\\&\quad =\int _{Q}\left|\left( \sum _{i=1}^{n}w^{i}\left( t\right) {\bar{f}}\left( x,t\right) \right) +\left( v\left( t\right) +x\circ w\left( t\right) \right) \cdot \nabla {\bar{f}}\left( x,t\right) \right|dxdt\\&\quad \le c\Vert f\Vert _{L^{2}\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) }, \end{aligned} \end{aligned}$$

using the Poincaré inequality [2, 6.7] and the function

$$\begin{aligned} t\mapsto \int _{\Omega }\nabla \left( \left( v\left( t\right) +x\circ w\left( t\right) \right) {\bar{f}}\left( x,t\right) \right) dx, \end{aligned}$$

is locally integrable, see Fubini’s Theorem [4, Theorem 6.11, Theorem 6.9]. Since

$$\begin{aligned} \left( x,t\right) \mapsto H\left( x,t,{\bar{f}},v,w,\nabla {\bar{p}}\right) -H\left( x,t,{\bar{f}},{\bar{v}},{\bar{w}},\nabla {\bar{p}}\right) , \end{aligned}$$

is measurable on Q, see [18, Proposition 2.1.7] and an element of \(L^{1}\left( Q\right) \), we apply Fubini’s Theorem [4, Theorem 6.11, Theorem 6.9] and obtain

$$\begin{aligned} t\mapsto \int _{\Omega }H\left( x,t,{\bar{f}},v,w,\nabla {\bar{p}}\right) -H\left( x,t,{\bar{f}},{\bar{v}},{\bar{w}},\nabla {\bar{p}}\right) dx\in L^{1}\left( 0,T\right) , \end{aligned}$$

and thus we obtain from the Average Value Theorem [29, Theorem 51] the following

$$\begin{aligned} 0\le \int _{\Omega }\left( H\left( x,t,{\bar{f}},v,w,\nabla {\bar{p}}\right) -H\left( x,t,{\bar{f}},{\bar{v}},{\bar{w}},\nabla {\bar{p}}\right) \right) dx, \end{aligned}$$

by taking the limit over k on both sides of inequality (35), for all \(v\in K_{V}\) and \(w\in K_{W}\) and for almost all \(t\in \left( 0,T\right) \), after renaming \(t_{0}\) as t.

\(\square \)

We remark that the (unusual) integral form of the PMP given in (34) results from the fact that the controls depend only on the time variable, and so does the needle variation. This is in contrast to the case where the control depends on both variables (x, t), see, e.g., [13, 14, 30] and references therein, in which case the needle variation is defined in Q.

We also see that (34) involves the PDF, which is consistent with the fact that we are characterizing an open-loop control for (15). The situation is different for our second FP optimal control problem corresponding to the stochastic process (16). In this case, the drift has the closed-loop structure that leads to important consequences that we illustrate below.

Our second FP optimal control problem is given by

$$\begin{aligned} \begin{aligned}&\min _{f,u}J\left( f,u\right) :=\int _{0}^{T}\int _{\Omega }G\left( x,t,u\left( x,t\right) \right) f\left( x,t\right) dxdt+\int _{\Omega }F\left( x\right) f\left( x,T\right) dx\\&{\mathrm {s.t.\ }}\int _{\Omega }\left( f^{\prime}\left( x,t\right) \varphi \left( x\right) +\frac{\sigma ^{2}}{2}\nabla f\left( x,t\right) \cdot \nabla \varphi \left( x\right) \right. \\&\quad \left. +\,\nabla \cdot \left( u\left( x,t\right) \, f\left( x,t\right) \right) \varphi \left( x\right) \right) dx=0\\&\ \quad \ \ {\mathrm {a.e.\ in\ }}\left( 0,T\right) \ {\mathrm {for\ all\ }}\varphi \in H_{0}^{1}\left( \Omega \right) \\&f\left( \cdot ,0\right) =f_{0}\\&u\in U_{ad} . \end{aligned} \end{aligned}$$
(37)

Similar to (31), we assume the structure \(G\left( x,t,u\left( x,t\right) \right) :=A\left( x,t\right) +\alpha \, g_{s}\left( u\left( x,t\right) \right) \). The PMP Hamiltonian corresponding to (37) is given by

$$\begin{aligned} H\left( x,t,f,u,\zeta \right) :=\left( G\left( x,t,u\right) +\zeta \cdot u \right) \, f . \end{aligned}$$
(38)

Next, we prove that a solution to (37) has the following PMP characterization.

Theorem 6

Let \(\left( {\bar{f}},{\bar{u}}\right) \) be a solution to (37). Then it holds that

$$\begin{aligned} H\left( x,t,{\bar{f}},{\bar{u}},\nabla {\bar{p}}\right) =\min _{u\in K_{U}}H\left( x,t,{\bar{f}},u,\nabla {\bar{p}}\right) , \end{aligned}$$
(39)

for almost all \(\left( x,t\right) \in Q\), where \({\bar{p}}\) is the solution to (29) for \(u\leftarrow {\bar{u}}\).

Proof

We have that \(p\in L^{q}\left( 0,T;W_{0}^{1,q}\left( \Omega \right) \right) \) due to the regularity of the right-hand side of (29), see [11] and [32, Proposition 8.35]. By [22, Theorem 3.1], we have that \(f\in L^{2}\left( 0,T;H_{0}^{1}\left( \Omega \right) \right) \). Then the proofs of Lemma 4 and Theorem 5 can be carried out analogously, where the corresponding control terms are replaced by the control of (37), and the needle variation is now defined in Q. Going step by step through the mentioned proofs, we can apply the same arguments to the control u. \(\square \)

We remark that Theorem 6 is analogous to [14, Theorem 3.3] or [13, Theorem 2] with similar proofs. The main difference in the proof of Theorem 5 with respect to that of Theorem 6 is that in the former the needle variation is performed for functions in (0, T) whereas in the latter variations of functions in Q are considered.

Now, we focus on (38) and (39), and notice that our FP equation is uniformly parabolic. Therefore the PDF is positive almost everywhere, and we can write the PMP condition (39) in the following form

$$\begin{aligned} \left( G\left( x,t,{\bar{u}} \right) +\nabla {\bar{p}} \cdot {\bar{u}} \right) =\min _{ u \in K_{U}} \left( G\left( x,t,{u} \right) +\nabla {\bar{p}} \cdot {u} \right) , \end{aligned}$$
(40)

for almost all \(\left( x,t\right) \in Q\).

This last result and the fact that f does not enter the formulation of the adjoint FP problem imply that the optimal control u is independent of the PDF, and thus independent of the initial condition \(f_0\) and hence of the initial condition of the stochastic process (16). Therefore the optimal control u has the structure and the significance of a closed-loop control (feedback law).

Notice that, since the Hamiltonian is continuous in the control argument, the control obtained in (40) as a result of an \(\mathop {\hbox {arg min}}\nolimits \)-function is measurable; see [31, 14.29 Example, 14.37 Theorem].

4 Numerical approximation and optimization methods

In this section, starting from the PMP characterisation of solutions to our FP control problems (30) and (37), we discuss two numerical solution procedures. In the first case, we implement the SQH method that was proposed in [14]. In the second case, we exploit the special structure of the resulting optimality system and the connection to the HJB problem to formulate a variant of the SQH solution procedure for determining the optimal control. This variant bears similarity to a well-known solution scheme for the HJB equation [39]. However, to avoid confusion, we call this procedure the SQH direct Hamiltonian (SQH-DH) method.

For the implementation of both methods, we illustrate the numerical approximation of the FP and adjoint FP problems. For this purpose, we consider the two-dimensional case, \(n=2\), and a square domain \(\Omega =(-\ell , \ell ) \times (-\ell , \ell ) \). We define a uniform grid \({\bar{\Omega }}_h\) given by

$$\begin{aligned} {\bar{\Omega }}_h = \lbrace {(x_i^1,x_j^2)\in {\mathbb {R}}^2:(x^1_i,x^2_j) = (-\ell + ih,-\ell + jh), \; i,j \in \lbrace {0,\ldots ,N_x }\rbrace }\rbrace , \end{aligned}$$

where \(N_x\) is the number of subintervals in each direction and \(h=2\ell /N_x\) is the mesh size. Further, let \(\delta {t}=T/N_t\) be the time-step size, where \(N_t\) denotes the number of time steps. Define

$$\begin{aligned} Q_{h,\delta {t}} = \lbrace {(x^1_i,x^2_j,t_m):(x^1_i,x^2_j)\in \Omega _h,~t_m=m\, \delta {t},~0\le m\le N_t}\rbrace . \end{aligned}$$

On this grid, \(\phi _{i,j}^m\) represents the value of a grid function in \(\Omega _h\) at \((x_i^1,x_j^2)\) and time \(t_m\).
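For concreteness, this grid and the time mesh can be set up as in the following Python/NumPy sketch; the parameter values are those used in the numerical experiments of Sect. 5, and all variable names are ours and purely illustrative.

```python
import numpy as np

# mesh parameters as in Sect. 5 (illustrative values)
ell, Nx, Nt, T = 2.0, 40, 80, 2.0

h = 2.0 * ell / Nx                 # uniform mesh size in both directions
dt = T / Nt                        # time-step size delta t

# grid nodes x^1_i = -ell + i h and x^2_j = -ell + j h, with i, j = 0, ..., Nx
x = -ell + h * np.arange(Nx + 1)
X1, X2 = np.meshgrid(x, x, indexing="ij")   # X1[i, j] = x^1_i, X2[i, j] = x^2_j

t = dt * np.arange(Nt + 1)         # time levels t_m = m * dt
```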

For the space discretization of the FP equation, we consider the Chang–Cooper (CC) scheme, a second-order accurate spatial discretization that guarantees positivity of the PDF [5, 17, 28]. For the formulation of the CC scheme, one considers the flux form of the FP equation \(\partial _ t f = \nabla \cdot {\mathcal {F}}(f)\), where the ith component of the flux \({\mathcal {F}}\) is given by

$$\begin{aligned} {\mathcal {F}}_i (f)= \sum _{j=1}^{n}\partial _{x_j}( a_{ij}(x) \, f(x,t) ) - b_{i}(x,t) \, f(x,t) , \qquad i=1,2 . \end{aligned}$$
(41)

In the CC method we have the following finite-volume approximation

$$\begin{aligned} \nabla \cdot { {\mathcal {F}} }=\frac{1}{h}\left\lbrace {\left({\mathcal {F}}_{i+\frac{1}{2},j}^m-{\mathcal {F}}_{i-\frac{1}{2},j}^m\right)+\left({\mathcal {F}}_{i,j+\frac{1}{2}}^m-{\mathcal {F}}_{i,j-\frac{1}{2}}^m\right)}\right\rbrace , \end{aligned}$$

where \({\mathcal {F}}_{i+\frac{1}{2},j}^m\) and \({\mathcal {F}}_{i,j+\frac{1}{2}}^m\) represent the fluxes in the \(x^1\)- and \(x^2\)-direction, respectively. To compute these flux terms, Chang and Cooper proposed to use a linear convex combination of the values of f at the cells sharing the same edge. For example, considering the edge between the grid points \((i,j)\) and \((i+1,j)\), we have

$$\begin{aligned} f^{m}_{i+1/2,j} =\left(1-\delta _i^j\right)\, f^{m}_{i+1,j} + \delta _i^j \, f^{m}_{i,j}, \end{aligned}$$

where the value of \(\delta _i^j\) is specified as follows. Define \(B_{i+\frac{1}{2},j}^m = -b_1(x^1_{i+\frac{1}{2}},x^2_j,t_m)\) and \(B_{i,j+\frac{1}{2}}^m = -b_2(x^1_{i},x^2_{j+\frac{1}{2}},t_m)\). Thus, we have

$$\begin{aligned} \begin{aligned}&\delta _i^j = \frac{1}{w_i^j}-\frac{1}{\exp (w_i^j)-1}, \qquad w_i^j = 2hB_{i+\frac{1}{2},j}^m/\sigma ^2,\\&\delta _j^i = \frac{1}{w_j^i}-\frac{1}{\exp (w_j^i)-1}, \qquad w_j^i = 2hB_{i,j+\frac{1}{2}}^m/\sigma ^2. \end{aligned} \end{aligned}$$
(42)

Therefore the numerical fluxes are given by

$$\begin{aligned} {\mathcal {F}} _{i+\frac{1}{2},j}^m = \left[ {(1-\delta _i^j)B_{i+\frac{1}{2},j}^m + \frac{\sigma ^2}{2h}}\right] f_{i+1,j}^{m} - \left[ {\frac{\sigma ^2}{2h}-\delta _i^j B_{i+\frac{1}{2},j}^m }\right] f_{i,j}^{m}, \end{aligned}$$
(43)

and

$$\begin{aligned} {\mathcal {F}} _{i,j+\frac{1}{2}}^m = \left[ {(1-\delta _j^i)B_{i,j+\frac{1}{2}}^m + \frac{\sigma ^2}{2h}}\right] f_{i,j+1}^{m} - \left[ {\frac{\sigma ^2}{2h}-\delta _j^i B_{i,j+\frac{1}{2}}^m }\right] f_{i,j}^{m} . \end{aligned}$$
(44)
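As an illustration, for a constant dispersion coefficient \(\sigma \), the weights (42) and the numerical fluxes (43) in the \(x^1\)-direction can be evaluated as in the following sketch; the function names and the array layout are our own choices and not part of the scheme.

```python
import numpy as np

def cc_delta(w):
    """Chang-Cooper weight delta = 1/w - 1/(exp(w) - 1), with the limit value 1/2 as w -> 0."""
    w = np.asarray(w, dtype=float)
    out = np.full_like(w, 0.5)
    mask = np.abs(w) > 1e-12
    out[mask] = 1.0 / w[mask] - 1.0 / np.expm1(w[mask])
    return out

def cc_flux_x1(f, B_half, sigma, h):
    """Numerical flux F^m_{i+1/2,j} of (43) on all edges in the x^1-direction.

    f      : (Nx+1, Nx+1) array with the grid values f^m_{i,j}
    B_half : (Nx, Nx+1) array with B^m_{i+1/2,j} = -b_1 at the edge midpoints
    """
    w = 2.0 * h * B_half / sigma**2            # w_i^j of (42)
    delta = cc_delta(w)
    D = sigma**2 / (2.0 * h)
    return ((1.0 - delta) * B_half + D) * f[1:, :] - (D - delta * B_half) * f[:-1, :]
```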

For the time discretization, we consider a combination of the first-order implicit Euler scheme and the second-order backward differentiation formula (BDF2). Specifically, for the first time step, we implement the following backward Euler scheme

$$\begin{aligned} \frac{f_{i,j} ^{m}-f_{i,j} ^{m-1}}{\delta {t}} = \frac{1}{h}\left({\mathcal {F}}_{i+\frac{1}{2},j}^{m}-{\mathcal {F}}_{i-\frac{1}{2},j}^{m}\right)+ \frac{1}{h}\left({\mathcal {F}}_{i,j+\frac{1}{2}}^{m}-{\mathcal {F}}_{i,j-\frac{1}{2}}^{m}\right), \end{aligned}$$
(45)

where \(m=1\). On the other hand, for \(m=2, \ldots , N_t\), we use the following second-order (BDF2) scheme

$$\begin{aligned} \frac{3 f_{i,j} ^{m}-4 f_{i,j} ^{m-1} + f_{i,j} ^{m-2}}{2\delta {t}} = \frac{1}{h}\left({\mathcal {F}}_{i+\frac{1}{2},j}^{m}-{\mathcal {F}}_{i-\frac{1}{2},j}^{m}\right)+ \frac{1}{h}\left({\mathcal {F}}_{i,j+\frac{1}{2}}^{m}-{\mathcal {F}}_{i,j-\frac{1}{2}}^{m}\right), \end{aligned}$$
(46)

The numerical analysis of these schemes is presented in [28], where it is proved that the resulting numerical solution is \(O(\delta t + h^2)\) accurate with (45), and \(O(\delta t^2 + h^2)\) accurate with (46). Therefore, in order to guarantee a globally second-order accurate solution, the first scheme is used to solve the FP problem on the interval \([0,\delta t]\) with step size \(\widetilde{\delta t}=\delta t ^2\).
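A minimal time-stepping sketch of (45)–(46) is the following; it assumes a user-provided routine assemble_cc_operator(m) (not shown) that returns the sparse matrix representing the CC approximation of \(\nabla \cdot {\mathcal {F}}\) at time level m, so that both implicit steps reduce to sparse linear solves.

```python
from scipy.sparse import identity
from scipy.sparse.linalg import spsolve

def advance_fp(f0, assemble_cc_operator, dt, Nt):
    """Implicit Euler substeps on [0, dt] followed by BDF2 steps, cf. (45)-(46)."""
    I = identity(f0.size, format="csr")

    # first macro step with sub-step size dt_tilde = dt**2 (operator kept fixed on [0, dt])
    L1 = assemble_cc_operator(1)
    f1 = f0.copy()
    for _ in range(int(round(dt / dt**2))):
        f1 = spsolve(I - dt**2 * L1, f1)       # (I - dt_tilde * L) f_new = f_old

    # BDF2: (3 f^m - 4 f^{m-1} + f^{m-2}) / (2 dt) = L^m f^m
    fs = [f0, f1]
    for m in range(2, Nt + 1):
        Lm = assemble_cc_operator(m)
        fs.append(spsolve(3.0 * I - 2.0 * dt * Lm, 4.0 * fs[-1] - fs[-2]))
    return fs
```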

Now, concerning the adjoint FP equation, it has been proved in [5, 34] that the transpose of (45) provides an \(O(\delta t + h^2)\) accurate approximation of the adjoint FP equation. This approximation is as follows

$$\begin{aligned} \begin{aligned} p_{i,j}^{m-1}&=S\left( p^{m},u^{m}\right) \\&:=p_{i,j}^{m}+\frac{\delta t}{h}\left( K_{i-\frac{1}{2},j}^{m}p_{i-1,j}^{m}-R_{i+\frac{1}{2},j}^{m}p_{i,j}^{m}-K_{i-\frac{1}{2},j}^{m}p_{i,j}^{m}+R_{i+\frac{1}{2},j}^{m}p_{i+1,j}^{m}\right) \\&\quad +\frac{\delta t}{h}\left( K_{i,j-\frac{1}{2}}^{m}p_{i,j-1}^{m}-R_{i,j+\frac{1}{2}}^{m}p_{i,j}^{m}-K_{i,j-\frac{1}{2}}^{m}p_{i,j}^{m}+R_{i,j+\frac{1}{2}}^{m}p_{i,j+1}^{m}\right) +\delta t \, G\left( b^{m}\right) , \end{aligned} \end{aligned}$$
(47)

where

$$\begin{aligned} K_{i+\frac{1}{2},j}^{m}&= \left( 1-\delta _{i}^j\right) B_{i+\frac{1}{2},j}^{m}+\frac{\sigma ^{2}}{h},\quad K_{i-\frac{1}{2},j}^{m}=\left( 1-\delta _{i-1}^j\right) B_{i-\frac{1}{2},j}^{m}+\frac{\sigma ^{2}}{h}, \\ K_{i,j+\frac{1}{2}}^{m}&= \left( 1-\delta _{j}^i\right) B_{i,j+\frac{1}{2}}^{m}+\frac{\sigma ^{2}}{h},\quad K_{i,j-\frac{1}{2}}^{m}=\left( 1-\delta _{j-1}^i\right) B_{i,j-\frac{1}{2}}^{m}+\frac{\sigma ^{2}}{h}, \\ R_{i+\frac{1}{2},j}^{m}&= -\delta _{i}^j B_{i+\frac{1}{2},j}^{m}+\frac{\sigma ^{2}}{h},\quad R_{i-\frac{1}{2},j}^{m}=-\delta _{i-1}^j B_{i-\frac{1}{2},j}^{m}+\frac{\sigma ^{2}}{h}, \\ R_{i,j+\frac{1}{2}}^{m}&= -\delta _{j}^i B_{i,j+\frac{1}{2}}^{m}+\frac{\sigma ^{2}}{h},\ \ R_{i,j-\frac{1}{2}}^{m}=-\delta _{j-1}^i B_{i,j-\frac{1}{2}}^{m}+\frac{\sigma ^{2}}{h} . \end{aligned}$$

A similar result also holds for the scheme (46); see, e.g., [5, 28, 34].
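For illustration, one backward step of (47) can be implemented as follows, assuming that the coefficient arrays defined above have been precomputed on the interior nodes; the routine name and the argument layout are ours.

```python
def adjoint_step(p, K_xm, R_xp, K_ym, R_yp, G, dt, h):
    """One step p^{m-1} = S(p^m, u^m) of (47) on the interior nodes i, j = 1, ..., Nx-1.

    p                      : (Nx+1, Nx+1) adjoint values at time level m
    K_xm, R_xp, K_ym, R_yp : (Nx-1, Nx-1) arrays with K^m_{i-1/2,j}, R^m_{i+1/2,j},
                             K^m_{i,j-1/2}, R^m_{i,j+1/2} (precomputed, assembly not shown)
    G                      : (Nx-1, Nx-1) array with the source term G(b^m)
    """
    p_new = p.copy()                 # boundary nodes are not updated here
    p_new[1:-1, 1:-1] = (
        p[1:-1, 1:-1]
        + dt / h * (K_xm * p[:-2, 1:-1] - (R_xp + K_xm) * p[1:-1, 1:-1] + R_xp * p[2:, 1:-1])
        + dt / h * (K_ym * p[1:-1, :-2] - (R_yp + K_ym) * p[1:-1, 1:-1] + R_yp * p[1:-1, 2:])
        + dt * G
    )
    return p_new
```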

Next, we discuss our numerical SQH optimization scheme [13, 14]. The main feature of this method is to consider the following augmented Hamiltonian

$$\begin{aligned} K_{\epsilon }\left( x,t,f,v,{\tilde{v}},w,{\tilde{w}},\zeta \right)&:= H\left( x,t,f,v,w,\zeta \right) +\epsilon \left( \left( v\left( t\right) -{\tilde{v}}\left( t\right) \right) ^{2}\right. \\&\left. +\left( w\left( t\right) -{\tilde{w}}\left( t\right) \right) ^{2}\right) , \end{aligned}$$

where \(v^{2}:=\sum _{i=1}^{n}\left( v^{i}\right) ^{2}\) for any vector \(v\in {\mathbb {R}}^{n}\). The quadratic term, which augments the Hamiltonian H, aims at penalising large updates of the control in a sweep that improves the control on all grid points where it is defined. In this way, the current values of the state and adjoint variables remain approximately valid during the iteration sweep on all grid points and do not need to be updated during this process, but only after its completion. As a result, the SQH scheme provides an efficient and robust optimization procedure. The SQH method for the open-loop setting is schematically illustrated with the following algorithm.

Algorithm 1 (SQH method)

1. Choose \(\epsilon >0\), \(\kappa \ge 0\), \({\hat{\sigma }}>1\), \(\zeta \in \left( 0,1\right) \), \(\eta \in \left( 0,\infty \right) \), \(v^{0},w^{0}\), compute \(f^{0}\) by (17) and \(p^{0}\) by (28) for \(v\leftarrow v^{0}\), \(w\leftarrow w^{0}\), set \(k\leftarrow 0\)

2. Find \(v\in K_{V}\) and \(w\in K_{W}\) such that

\(\int _{\Omega }K_{\epsilon }\left( x,t,f^{k},v,v^{k},w,w^{k},\nabla p^{k}\right) dx\le \int _{\Omega }K_{\epsilon }\left( x,t,f^{k},{\hat{v}},v^{k},{\hat{w}},w^{k},\nabla p^{k}\right) dx\)

for all \({\hat{v}}\in K_{V}\) and \({\hat{w}}\in K_{W}\) and for all \(t\in \left[ 0,T\right] \)

3. Calculate f by (17) with v, w and \(\tau _{1}:=\Vert v-v^{k}\Vert _{L^{2}\left( 0,T\right) }^{2}\), \(\tau _{2}:=\Vert w-w^{k}\Vert _{L^{2}\left( 0,T\right) }^{2}\)

4. If \(J\left( f,v,w\right) -J\left( f^{k},v^{k},w^{k}\right) >-\eta \left( \tau _{1}+\tau _{2}\right) \): Choose \(\epsilon \leftarrow {\hat{\sigma }} \epsilon \)

  Else: Choose \(\epsilon \leftarrow \zeta \epsilon \), \(f^{k+1}\leftarrow f\), \(v^{k+1}\leftarrow v\), \(w^{k+1}\leftarrow w\), calculate \(p^{k+1}\) by (28) with \(v^{k+1}\) and \(w^{k+1}\), set \(k\leftarrow k+1\)

5. If \(\tau _{1}+\tau _{2}<\kappa \): STOP and return \(v^{k}\) and \(w^{k}\)

 Else go to 2.

The well-posedness of this SQH scheme means that Step 2. to Step 4. are mathematically well defined and can always be performed. Specifically, the minimization in Step 2. is always solvable, see [14, Lemma 4.1]. Further, if a control is attained that is PMP optimal, then the algorithm stops, see [14, Lemma 4.3]. On the other hand, in Step 4., if no sufficient decrease of the value of J is attained, a larger value of \(\epsilon \) can always be found in finitely many steps such that we obtain an update of the control that satisfies the decrease condition given by \(\eta \); see [12, Lemma 50] and [14, (16)].

We remark that the convergence analysis of the SQH method for our FP control problems can be done analogously as in [13, 14]. However, for this purpose, we need to prove bounds for \(\Vert f\Vert _{L^\infty \left( 0,T;H^1_0\left( \Omega \right) \right) }\) and \(\Vert p\Vert _{L^\infty \left( 0,T;H^1_0\left( \Omega \right) \right) }\), which are not discussed in this paper.
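The acceptance/rejection logic of Algorithm 1 can be summarised by the following Python-style skeleton, in which solve_fp, solve_adjoint, minimize_augmented_hamiltonian and cost are placeholders for the discretized versions of (17), (28), Step 2. and the functional J, respectively; they are assumptions of this sketch and not library routines.

```python
import numpy as np

def sqh(v, w, solve_fp, solve_adjoint, minimize_augmented_hamiltonian, cost,
        eps=1e2, kappa=1e-14, sigma_hat=25.0, zeta=0.1, eta=1e-8, max_iter=1000):
    """Sketch of Algorithm 1 (SQH method)."""
    f, p = solve_fp(v, w), solve_adjoint(v, w)
    J = cost(f, v, w)
    for _ in range(max_iter):
        # Step 2.: pointwise minimization of the augmented Hamiltonian K_eps
        v_new, w_new = minimize_augmented_hamiltonian(f, p, v, w, eps)
        # Step 3.: state update and control increments tau_1 + tau_2 (quadrature weight omitted)
        f_new = solve_fp(v_new, w_new)
        tau = float(np.sum((v_new - v) ** 2) + np.sum((w_new - w) ** 2))
        J_new = cost(f_new, v_new, w_new)
        # Step 4.: accept the update only if the cost decreases sufficiently
        if J_new - J > -eta * tau:
            eps *= sigma_hat                   # reject and increase the penalty
        else:
            eps *= zeta                        # accept and relax the penalty
            v, w, f, J = v_new, w_new, f_new, J_new
            p = solve_adjoint(v, w)
        # Step 5.: stopping criterion
        if tau < kappa:
            break
    return v, w
```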

The SQH scheme given above is applied to solve the open-loop FP optimal control problem (30).

In the case of the closed-loop control problem (37), we consider a variant of the SQH scheme that consistently implements the equivalence of the FP control problem with the HJB formulation, assuming that the PDF is everywhere positive. This variant is obtained according to (40) by introducing the following augmented Hamiltonian

$$\begin{aligned} {{\tilde{K}}}_{\epsilon }\left( x,t,u,{\tilde{u}}, \zeta \right) :=\left( G\left( x,t,{u}(t) \right) +\zeta \cdot {u}(t) \right) +\epsilon \, \left( u\left( t\right) -{\tilde{u}}\left( t\right) \right) ^{2}. \end{aligned}$$

This choice of the augmented Hamiltonian results naturally from the positivity of the PDF and from considering the expected value of the quadratic penalization.

Therefore we implement the following iterative scheme.

Algorithm 2 (SQH-DH method)

1. Choose \(\epsilon >0\), \(\kappa \ge 0\), \({\hat{\sigma }}>1\), \(\zeta \in \left( 0,1\right) \), \(\eta \in \left( 0,\infty \right) \), \(u^{0}\), compute \(f^{0}\) by (17) and \(p^{0}\) by (28) for \(u\leftarrow u^{0}\), set \(k\leftarrow 0\)

2. Find \(u\in K_{U}\) such that

\({{\tilde{K}}}_{\epsilon }\left( x,t,u,u^{k},\nabla p^{k}\right) \le {{\tilde{K}}}_{\epsilon }\left( x,t,{\hat{u}},u^{k},\nabla p^{k}\right) \)

 for all \({\hat{u}}\in K_{U}\) and for all \(t\in \left[ 0,T\right] \)

3. Calculate f by (17) with u and \(\tau :=\Vert u-u^{k}\Vert _{L^{2}\left( Q\right) }^{2}\)

4. If \(J\left( f,u\right) -J\left( f^{k},u^{k}\right) >-\eta \, \tau \): Choose \(\epsilon \leftarrow {\hat{\sigma }}\epsilon \)

 Else: Choose \(\epsilon \leftarrow \zeta \epsilon \), \(f^{k+1}\leftarrow f\), \(u^{k+1}\leftarrow u\), calculate \(p^{k+1}\) by (28) with \(u^{k+1}\), set \(k\leftarrow k+1\)

5. If \(\tau <\kappa \): STOP and return \(u^{k}\)

 Else go to 2.

Notice that in the SQH-DH scheme the calculation of f is only required to evaluate the cost functional. Furthermore, at convergence the solution of the adjoint problem corresponds to solving the HJB equation as discussed in [39]. However, the gradual update of the control thanks to the quadratic penalization makes the SQH-DH approach more robust, and it does not suffer from instabilities for large time-step sizes.

We see that, in both SQH optimization schemes, Step 2. requires the point-wise solution of minimization problems for the augmented Hamiltonian over low-dimensional compact sets (of dimension 4 and 2, respectively, counting all components). These problems can be solved by many methods, e.g., by considering a discretization of these sets. In particular, we refer to derivative-free optimization methods [19, 26]. See [13] for an application of the secant method in our context.

However, one great advantage of the SQH approach is that a case study of the minimization of the augmented Hamiltonian can be performed beforehand by simple analytical tools, which delivers a small set of possible minimizers depending on the data of the problem. In this case, one is only required to evaluate \(K_\epsilon \) (or \({{\tilde{K}}}_\epsilon \)) at these points and choose the actual minimizer.

In our experience, this approach has broad applicability, as demonstrated in [13], where different cost functionals and PDE models are considered. In the following, we illustrate this fact for our specific setting. We focus on the FP optimal control problem (30), and choose

$$\begin{aligned} G\left( x,t,v,w\right) = A\left( x,t\right) +\alpha \, g_{s_{1}}\left( v\right) +\beta \, g_{s_{2}}\left( w\right) , \end{aligned}$$

where A is a continuous function that we specify in the numerical experiments section. We consider the following (generalised) \(L^1\) control costs

$$\begin{aligned} g_{s}\left( z\right) :=\max \left( 0,|z^{1}|-s\right) +\max \left( 0,|z^{2}|-s\right) , \end{aligned}$$
(48)

where \(s_{1}=\frac{3}{5}\), \(s_{2}=\frac{3}{10}\). The admissible values of the controls are given by the intervals \(K_{V}^{1}=K_{V}^{2}=\left[ v_{\min },v_{\max }\right] \) and \(K_{W}^{1}=K_{W}^{2}=\left[ w_{\min },w_{\max }\right] \), where \(v_{\min }=-2\), \(v_{\max }=2\), \(w_{\min }=-1\) and \(w_{\max }=1\).

We have

$$\begin{aligned}\begin{aligned}&\int _{\Omega }K_{\epsilon }\left( x,t,f,v,{\tilde{v}},w,{\tilde{w}},\nabla p\right) dx\\&\quad =\int _{\Omega }G\left( x,t, v,w\right) f\left( x,t\right) +\nabla p\left( x,t\right) \cdot \left( v+x\circ w\right) f\left( x,t\right) \\&\qquad +\,\epsilon \left( \left( v-{\tilde{v}}\left( t\right) \right) ^{2}+\left( w-{\tilde{w}}\left( t\right) \right) ^{2}\right) dx\\&\quad =G\left( x,t,v,w\right) \varpi \left( t\right) +\sum _{i=1}^{2}v^{i}\varrho _{i}\left( t\right) +\sum _{i=1}^{2}w^{i}\varsigma _{i}\left( t\right) \\&\qquad +\,\epsilon |\Omega |\left( \sum _{i=1}^{n}\left( v^{i}-{\tilde{v}}^{i}\left( t\right) \right) ^{2}+\sum _{i=1}^{n}\left( w^{i}-{\tilde{w}}^{i}\left( t\right) \right) ^{2}\right) , \end{aligned} \end{aligned}$$

where \(|\Omega |\) is the measure of \(\Omega \), \(\varpi \left( t\right) :=\int _{\Omega }f\left( x,t\right) dx\) and

$$\begin{aligned} \varrho _{i}\left( t\right) :=\int _{\Omega }\frac{\partial }{\partial x_{i}}p\left( x,t\right) f\left( x,t\right) dx,\ \ \varsigma _{i}\left( t\right) :=\int _{\Omega }x_{i}\frac{\partial }{\partial x_{i}}p\left( x,t\right) f\left( x,t\right) dx,\ \ i=\left\{ 1,2\right\} . \end{aligned}$$
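In the discrete setting, \(\varpi \left( t\right) \), \(\varrho _{i}\left( t\right) \) and \(\varsigma _{i}\left( t\right) \) can be approximated at each time level as in the following sketch; the use of central differences for \(\nabla p\) and of a simple rectangle-rule quadrature is our implementation choice and is not prescribed by the text.

```python
import numpy as np

def reduced_coefficients(f, p, X1, X2, h):
    """Approximate varpi(t), rho_i(t), varsigma_i(t) at one time level.

    f, p   : (Nx+1, Nx+1) grid values of the PDF and the adjoint at time t
    X1, X2 : coordinate arrays of the grid nodes (as built above)
    """
    px1, px2 = np.gradient(p, h, h)            # central-difference approximation of grad p
    quad = lambda g: h * h * float(np.sum(g))  # rectangle-rule quadrature over Omega
    varpi = quad(f)
    rho = np.array([quad(px1 * f), quad(px2 * f)])
    vsig = np.array([quad(X1 * px1 * f), quad(X2 * px2 * f)])
    return varpi, rho, vsig
```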

We remark that due to the zero boundary conditions it holds that \(0\le \varpi \left( t\right) \le 1\). Since

$$\begin{aligned} \max \left( 0,|z|-s\right) ={\left\{ \begin{array}{ll} z-s & {\mathrm {\ if\ }}z>s\\ -z-s & {\mathrm {\ if\ }}z<-s\\ 0 & {\mathrm {\ if\ }}|z|\le s \end{array}\right. }, \end{aligned}$$

the pointwise minimum of \(\left( v,w\right) \mapsto \int _{\Omega }K_{\epsilon }\left( x,t,f,v,{\tilde{v}},w,{\tilde{w}},\nabla p\right) dx\) is given by the following case study, where we use the differentiability of this map in the intervals \(\left( v_{\min },-s_{1}\right) \), \(\left( -s_{1},s_{1}\right) \), \(\left( s_{1},v_{\max }\right) \) for each component of v, and analogously with \(s_{2}\) for w.

By inspection, we see that the minimization problem in the given intervals reduces to evaluating the integral of the augmented Hamiltonian on a discrete set of points of the admissible set of control values as follows

$$\begin{aligned}\begin{aligned}\left( v\left( t\right) ,w\left( t\right) \right) &=\mathop {\hbox {arg min}}\limits _{v\in K_{V},w\in K_{W}}\int _{\Omega }K_{\epsilon }\left( x,t,f,v,{\tilde{v}},w,{\tilde{w}},\nabla p\right) dx\\& =\mathop {\hbox {arg min}}\limits _{v\in {\tilde{K}}_{V}\left( t\right) ,w\in {\tilde{K}}_{W}\left( t\right) }\int _{\Omega }K_{\epsilon }\left( x,t,f,v,{\tilde{v}},w,{\tilde{w}},\nabla p\right) dx, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} {\tilde{K}}_{V}\left( t\right)&:= {\tilde{K}}_{V}^{1}\left( t\right) \times {\tilde{K}}_{V}^{2}\left( t\right) ,\ {\tilde{K}}_{W}\left( t\right) :={\tilde{K}}_{W}^{1}\left( t\right) \times {\tilde{K}}_{W}^{2}\left( t\right) , \\ {\tilde{K}}_{V}^{i}\left( t\right)&:= \left\{ v_{1}^{i}\left( t\right) ,v_{2}^{i}\left( t\right) ,v_{3}^{i}\left( t\right) \right\} ,\ {\tilde{K}}_{W}^{i}\left( t\right) :=\left\{ w_{1}^{i}\left( t\right) ,w_{2}^{i}\left( t\right) ,w_{3}^{i}\left( t\right) \right\} ,\ i=\left\{ 1,2\right\} , \end{aligned}$$

with

$$\begin{aligned} v_{1}^{i}\left( t\right)&= \min \left( \max \left( s_{1},\frac{2\epsilon |\Omega |{\tilde{v}}^{i}\left( t\right) -\alpha \varpi \left( t\right) -\varrho _{i}\left( t\right) }{2|\Omega |\epsilon }\right) ,v_{\max }\right) , \\ v_{2}^{i}\left( t\right)&= \min \left( \max \left( v_{\min },\frac{2\epsilon |\Omega |{\tilde{v}}^{i}\left( t\right) +\alpha \varpi \left( t\right) -\varrho _{i}\left( t\right) }{2|\Omega |\epsilon }\right) ,-s_{1}\right) , \\ v_{3}^{i}\left( t\right)&= \min \left( \max \left( -s_{1},\frac{2\epsilon |\Omega |{\tilde{v}}^{i}\left( t\right) -\varrho _{i}\left( t\right) }{2|\Omega |\epsilon }\right) ,s_{1}\right) , \\ w_{1}^{i}\left( t\right)&= \min \left( \max \left( s_{2},\frac{2\epsilon |\Omega |{\tilde{w}}^{i}\left( t\right) -\beta \varpi \left( t\right) -\varsigma _{i}\left( t\right) }{2|\Omega |\epsilon }\right) ,w_{\max }\right) , \\ w_{2}^{i}\left( t\right)&= \min \left( \max \left( w_{\min },\frac{2\epsilon |\Omega |{\tilde{w}}^{i}\left( t\right) +\beta \varpi \left( t\right) -\varsigma _{i}\left( t\right) }{2|\Omega |\epsilon }\right) ,-s_{2}\right) , \end{aligned}$$

and

$$\begin{aligned} w_{3}^{i}\left( t\right) =\min \left( \max \left( -s_{2},\frac{2\epsilon |\Omega |{\tilde{w}}^{i}\left( t\right) -\varsigma _{i}\left( t\right) }{2|\Omega |\epsilon }\right) ,s_{2}\right) , \end{aligned}$$

for any \(t\in \left[ 0,T\right] \), and \(i=\left\{ 1,2\right\} \), since the minimum is attained either in the interior of the corresponding intervals, where the derivative with respect to v or w vanishes, or on the boundary of these intervals, see [3, IV Remark 2.2 (b)].
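A sketch of the resulting update for one component of v at a fixed time level is given below: it evaluates the v-dependent part of \(\int _{\Omega }K_{\epsilon }\,dx\) at the three clipped candidates and returns the best one; the corresponding w-update is analogous. All names are ours and purely illustrative.

```python
def clip(z, lo, hi):
    return min(max(z, lo), hi)

def update_v_component(v_tilde, rho_i, varpi, eps, omega_meas, alpha, s1, v_min, v_max):
    """Candidate evaluation for one component v^i: branch-wise stationary points of
    alpha*varpi*g_{s1}(v) + v*rho_i + eps*|Omega|*(v - v_tilde)^2, clipped to their branch."""
    c = 2.0 * eps * omega_meas
    candidates = [
        clip((c * v_tilde - alpha * varpi - rho_i) / c, s1, v_max),   # branch v > s1
        clip((c * v_tilde + alpha * varpi - rho_i) / c, v_min, -s1),  # branch v < -s1
        clip((c * v_tilde - rho_i) / c, -s1, s1),                     # branch |v| <= s1
    ]
    def phi(v):  # v-dependent part of the integrated augmented Hamiltonian
        return (alpha * varpi * max(0.0, abs(v) - s1)
                + v * rho_i + eps * omega_meas * (v - v_tilde) ** 2)
    return min(candidates, key=phi)
```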

Next, we consider the FP optimal control problem given in (37). In this case, we choose

$$\begin{aligned} G\left( x,t, u\right) =A\left( x,t\right) +\frac{\alpha }{2}\left( u_{1}^{2}+u_{2}^{2}\right) +\beta \left( |u_{1}|+|u_{2}|\right) , \end{aligned}$$

where \(u=(u_1,u_2)\) and A is as given above. The admissible set of values of the control is given by the interval \(K_U=[u_{\min }, u_{\max }]^2\) with \(u_{\min }=-10\), \(u_{\max }=10\).

Also in this case, we can determine a priori the set of points where the augmented Hamiltonians can take a minimum. Specifically, in the SQH-DH method, we have

$$\begin{aligned}\begin{aligned}u&=\mathop {\hbox {arg min}}\limits _{v\in K_{U}}\,\left( G\left( v\right) +\nabla _{h}p\cdot v\right) +\epsilon \left( v-{\tilde{u}}\right) ^{2}\\& =\mathop {\hbox {arg min}}\limits _{v_{1}\in \left\{ v_{1}^{1},v_{1}^{2}\right\} ,\,v_{2}\in \left\{ v_{2}^{1},v_{2}^{2}\right\} }\left( \frac{\alpha }{2}\left( v_{1}^{2}+v_{2}^{2}\right) +\beta \left( |v_{1}|+|v_{2}|\right) +v_{1}\nabla _{h}^{1}p+v_{2}\nabla _{h}^{2}p\right) \\&\quad +\,\epsilon \left( v-{\tilde{u}}\right) ^{2}, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} v_{i}^{1}=\min \left( \max \left( 0,\frac{2\epsilon {\tilde{u}}_{i}\left( x,t\right) -\nabla _{h}^{i}p\left( x,t\right) -\beta }{2\epsilon +\alpha }\right) ,u_{\max }\right) , \end{aligned}$$

and

$$\begin{aligned} v_{i}^{2}=\min \left( \max \left( u_{\min },\frac{2\epsilon {\tilde{u}}_{i}\left( x,t\right) -\nabla _{h}^{i}p\left( x,t\right) +\beta }{2\epsilon +\alpha }\right) ,0\right) . \end{aligned}$$
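Also the SQH-DH pointwise update can be written compactly: for each component and each grid point one evaluates the reduced augmented Hamiltonian at the two clipped candidates above (the A-part is independent of the control and can be omitted) and takes the smaller value. The following sketch, with names and array layout of our choosing, does this in vectorised form.

```python
import numpy as np

def sqh_dh_pointwise_update(u_tilde, grad_p, eps, alpha, beta, u_min, u_max):
    """Pointwise minimizer of (alpha/2) v^2 + beta |v| + grad_p * v + eps (v - u_tilde)^2,
    evaluated per component and per grid point (arrays of identical shape)."""
    c1 = np.clip((2.0 * eps * u_tilde - grad_p - beta) / (2.0 * eps + alpha), 0.0, u_max)
    c2 = np.clip((2.0 * eps * u_tilde - grad_p + beta) / (2.0 * eps + alpha), u_min, 0.0)

    def phi(v):
        return 0.5 * alpha * v**2 + beta * np.abs(v) + grad_p * v + eps * (v - u_tilde) ** 2

    return np.where(phi(c1) <= phi(c2), c1, c2)
```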

In the following section, we validate and compare our SQH schemes, where in Step 2. we take advantage of the pre-determined minimizers.

5 Numerical experiments

In this section, we report results of numerical experiments that validate our FP optimization framework and the ability of the resulting controls to drive the related stochastic processes in order to perform given tasks. Concerning the first goal, we would like to demonstrate that our optimization procedure is able to provide a solution that satisfies the PMP optimality conditions discussed in the previous sections. For this purpose, we define a measure of PMP optimality of the numerical solution of the open-loop setting as follows

$$\begin{aligned} \triangle H\left( t\right) :=\int _{\Omega }H\left( x,t,f,v,w,\nabla p\right) dx-\min _{{\tilde{v}}\in K_{V},{\tilde{w}}\in K_{W}}\int _{\Omega }H\left( x,t,f,{\tilde{v}},{\tilde{w}},\nabla p\right) dx, \end{aligned}$$

assuming that f, v, w and p represent the output of Algorithm 1 applied to our FP problem (30). Similarly, we define a measure of PMP optimality for the closed-loop problem (37).

We also report values of \(N_{\%}^{l}\), \(l\in {\mathbb {N}}\), which give the percentage of time grid points where \(0\le \triangle H\left( t\right) \le 10^{-l}\) is fulfilled.
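Assuming that the values \(\triangle H\left( t_{m}\right) \) have been computed from the numerical solution, the reported percentages can be obtained with a small helper such as the following.

```python
import numpy as np

def pmp_percentages(dH, max_exponent=12):
    """Percentages N_%^l of time grid points with 0 <= Delta H(t_m) <= 10^{-l}."""
    dH = np.asarray(dH)
    return {l: 100.0 * float(np.mean((dH >= 0.0) & (dH <= 10.0 ** (-l))))
            for l in range(1, max_exponent + 1)}
```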

Our computational setting for both control problems is as follows. We choose \(\Omega =\left( -2,2\right) \times \left( -2,2\right) \), and consider a uniform space and time discretisation with \(N_x=40\), \(N_t=80\), and \(T=2\).

The initial condition for the FP problem is given by the following normalized Gaussian distribution

$$\begin{aligned} f_{0}\left( x\right) =\frac{1}{2\pi r^2} \, {\mathrm {e}}^{-\frac{|x-x_{0}|^{2}}{2 r^2}} , \end{aligned}$$
(49)

where \(r=0.3\) and \(x_{0}=(-1,0)\), and \(|\cdot |\) denotes the Euclidean norm in \({\mathbb {R}}^2\). In the FP equation, we choose a diffusion coefficient \(D=\frac{\sigma ^{2}}{2}=10^{-2}\).

In our cost functional for the open-loop problem, we specify the function \(G=A + \alpha \, g_{s_{1}}\left( v\right) +\beta \, g_{s_{2}}\left( w\right) \) choosing

$$\begin{aligned} A\left( x,t\right) = - \frac{10^{-3}}{2\pi r^2} \, {\mathrm {e}}^{-\frac{|x-x_d(t)|^{2}}{2 r^2}} , \end{aligned}$$
(50)

where \(x_{d}: [0,T]\rightarrow {\mathbb {R}}^{2}\) is given by the arc

$$\begin{aligned} x_{d}\left( t\right) =\left( \begin{array}{c} t-1\\ \sin \left( \pi \, t/2\right) \end{array}\right) . \end{aligned}$$
(51)

Further, we have \(g_s\) given in (48) with the parameters \(\alpha =\beta =10^{-4}\), and \(s_{1}=s_{2}=0\). The same function A and the same values \(\alpha =\beta =10^{-4}\) are chosen for our closed-loop problem. In both cases, the terminal function F is taken as follows

$$\begin{aligned} F\left( x\right) = - \frac{10^{-3}}{2\pi r^2} \, {\mathrm {e}}^{-\frac{|x-x_d(T)|^{2}}{2 r^2}} . \end{aligned}$$

The choice of A and F is motivated by the concept of ensemble control proposed in [15, 16] and analysed in [8]. To illustrate this setting, suppose that the purpose of the control is to make the trajectories of our stochastic models track the desired trajectory \(x_d(t)\), in the sense that minimizing this term corresponds to having all trajectories of the ensemble close to \(x_d\). For this purpose, the function A(x, t) represents an attracting potential that increases monotonically as a function of the distance \(|x-x_d(t)|\). Therefore, by minimization, f becomes mainly concentrated around the minimum of A, located at \(x_d\) (a valley). Similarly, the purpose of \(F(x)=A(x,T)\) is to model the requirement that the density f at the final time concentrates at the final position \(x_d(T)\).
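For reference, the data \(f_{0}\), A, F and \(x_{d}\) of (49)–(51) can be evaluated on the computational grid as in the following sketch, using the parameter values listed in this section; the helper names are ours.

```python
import numpy as np

ell, Nx, T = 2.0, 40, 2.0
h = 2.0 * ell / Nx
x = -ell + h * np.arange(Nx + 1)
X1, X2 = np.meshgrid(x, x, indexing="ij")
r, x0 = 0.3, np.array([-1.0, 0.0])

def gaussian(center):
    """Normalized Gaussian centred at 'center' with width r, cf. (49)."""
    return (np.exp(-((X1 - center[0]) ** 2 + (X2 - center[1]) ** 2) / (2.0 * r**2))
            / (2.0 * np.pi * r**2))

def x_d(t):
    """Desired trajectory (51)."""
    return np.array([t - 1.0, np.sin(np.pi * t / 2.0)])

f0 = gaussian(x0)                       # initial PDF (49)
A = lambda t: -1e-3 * gaussian(x_d(t))  # attracting potential (50)
F = A(T)                                # terminal cost, F(x) = A(x, T)
```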

The parameters for both SQH algorithms are set as follows. The initial guess is \(\epsilon =10^{2}\), and the controls are initialized by zero functions. We choose \(\eta =10^{-8}\), \({\hat{\sigma }}=25\), \(\zeta =10^{-1}\) and \(\kappa =10^{-14}\).

The main purpose of our experiments is to validate the resulting optimal control problems at the level of the stochastic dynamics. Therefore, once the controls are computed, we use them in Monte Carlo simulations to verify the ability of these controls to perform the given tasks. Moreover, by taking different initial conditions in the simulation of the controlled SDEs, we can test whether the controls have the closed-loop ability to drive the system to perform the given tasks from any initial configuration. In the following, we report results of numerical experiments with the setting above and for both control problems.
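The Monte Carlo validation can be sketched with an Euler–Maruyama discretization of the controlled SDE, as in the following snippet; u_interp is an assumed interpolant of the computed (closed-loop) control on the space–time grid, and for the open-loop model the drift \(v\left( t\right) +x\circ w\left( t\right) \) is used instead. With \(D=\sigma ^{2}/2=10^{-2}\), the dispersion is \(\sigma =\sqrt{2\times 10^{-2}}\).

```python
import numpy as np

def simulate_sde(u_interp, X0, sigma, T, n_steps, n_paths, seed=0):
    """Euler-Maruyama paths of dX = u(X, t) dt + sigma dW with X(0) = X0."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.tile(np.asarray(X0, dtype=float), (n_paths, 1))       # shape (n_paths, 2)
    mean_path = [X.mean(axis=0)]
    for m in range(n_steps):
        drift = u_interp(X, m * dt)                              # control evaluated along the paths
        X = X + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(X.shape)
        mean_path.append(X.mean(axis=0))
    return X, np.array(mean_path)

# example call (u_interp is a hypothetical interpolant of the computed control):
# X_T, EX = simulate_sde(u_interp, X0=np.array([-1.0, 0.0]), sigma=np.sqrt(2e-2),
#                        T=2.0, n_steps=80, n_paths=10)
```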

In Fig. 1a, we plot the open-loop optimal control functions \(v=(v_1,v_2)\) and \(w=(w_1,w_2)\) in [0, T], and in Fig. 1b, we depict the minimization history of the cost functional along the SQH iterations. Notice that the functional is monotonically decreasing. The numerical PMP test gives \(N_{\%}^{1}=100\%\), \(N_{\%}^{2}=100\%\), \(N_{\%}^{3}=100\%\), and \(N_{\%}^{4}=16\%\). These results indicate that the solution obtained with the SQH method is PMP optimal in the sense of (34) up to a tolerance of \(10^{-3}\).

Fig. 1 Results for the open-loop problem

In Fig. 2a, we plot the closed-loop optimal control functions \(u=(u_1,u_2)\) in \(\Omega \) at \(t=1\) and \(t=1.5\), and in Fig. 2b, we depict the minimization history of the cost functional along the SQH-DH iterations. Also in this case, notice that the functional is monotonically decreasing. The numerical PMP test gives \(N_{\%}^{1}=100\%\), \(N_{\%}^{2}=100\%\), \(N_{\%}^{3}=99\%\), \(N_{\%}^{4}=99\%\), \(N_{\%}^{5}=99\%\), \(N_{\%}^{6}=98\%\), \(N_{\%}^{7}=97\%\), \(N_{\%}^{8}=96\%\), \(N_{\%}^{9}=94\%\), \(N_{\%}^{10}=92\%\), \(N_{\%}^{11}=90\%\), and \(N_{\%}^{12}=88\%\), thus demonstrating that in this case the solution obtained with the SQH-DH method is PMP optimal close to machine precision on most grid points.

Fig. 2 Results for the closed-loop problem. The optimal control \(u_1\) (left) and \(u_2\) (right) at \(t=1\) (top) and \(t=1.5\) (bottom)

In order to allow a more direct comparison of the controls obtained with the two settings, in Fig. 3a, we plot again the closed-loop optimal control functions \(u=(u_1,u_2)\) in \(\Omega \) at \(t=1\) and \(t=1.5\), and compare them with Fig. 3b, where we depict \((v + w \circ x)\) in \(\Omega \) at \(t=1\) and \(t=1.5\). We see that, although some similarities can be recognized, the two controls differ substantially. Notice that with these controls and the setting above, the total probability at final time differs from the initial one by less than \(10^{-4}\).

Fig. 3 Comparison of the closed-loop control \(u=(u_1,u_2)\) (a) and the open-loop control \(v + w \circ x\) (b): in each case the first component (left) and second component (right) at \(t=1\) (top) and \(t=1.5\) (bottom)

Next, we show that the two controls perform similarly well when the initial conditions for the two stochastic models coincide with \(x_0\), where \(f_0\) is centred. For this purpose, in Fig. 4 we plot the evolution of \({\mathbb {E}}[X(t)]\) for the two stochastic processes in [0, T]. Moreover, in the same figure, a more detailed comparison is given by plotting a few (10) stochastic trajectories of the two models. We see that the distribution of these trajectories confirms the plots of the mean \({\mathbb {E}}[X(t)]\). On the other hand, we notice that the closed-loop control is more effective in attaining the tracking objective.

Fig. 4 Evolution of \({\mathbb {E}}[X(t)]\) (circles); the dashed line depicts the desired trajectory. Left: the closed-loop case; right: the open-loop case

Now, we validate the ability of the computed controls to provide a feedback law. In the case of our closed-loop control this feature is expected by construction, as discussed in this paper. On the other hand, we would like to investigate the claim in [15, 16] that our open-loop control provides a valuable approximation to a feedback control mechanism. In fact, the results in Fig. 3a, b partly support this claim. However, we perform a stricter validation by using these controls as drifts of our stochastic models and choosing an initial condition \(X_0\) that is far away from \(x_0\). The resulting trajectories are plotted in Fig. 5 and compared with the previous ones obtained with \(X_0=x_0\). We see that the closed-loop control is able to drive the SDE to follow the desired trajectory and attain the target configuration. On the other hand, the open-loop control mechanism appears unable to perform these tasks properly. Results of additional experiments confirm this conclusion.

Fig. 5 Trajectories of the SDE models with the closed-loop control (top) and the open-loop control (bottom). Left: trajectories starting at \(X_0=x_0=(-1,0)\); right: trajectories starting at \(X_0=(1,1)\)

6 Conclusion

A theoretical and computational framework to investigate open- and closed-loop control strategies for stochastic models was presented. This framework is based on the Pontryagin maximum principle (PMP) applied to optimal control problems governed by the Fokker–Planck (FP) equation, which models the evolution of the probability density function of these processes.

In this work, existence and PMP characterisation of optimal controls for the FP control problems were discussed. Further, PMP-based numerical optimization schemes were implemented to solve these problems. Results of experiments were presented that successfully validated the effectiveness of the PMP FP optimization framework and the ability of the resulting controls to drive the stochastic models to perform a given task.