1 Introduction

This paper combines two important topics in mechanics and control theory: Hamiltonian contact systems and the Pontryagin maximum principle in optimal control.

On the one hand, Hamiltonian contact systems have become increasingly popular in recent years because they describe dissipative dynamics and several other types of physical systems in thermodynamics, quantum mechanics, circuit theory, control theory, etc. (see for instance Bravetti 2019; Goto 2016; Kholodenko 2013; Ramirez et al. 2017; de León and Sardón 2017; Gaset et al. 2020b; Simoes et al. 2020; Sussmann 1999). Recently, a generalization of contact geometry has been developed to describe field theories with dissipation (Gaset et al. 2020a, c). The Hamiltonian formulation on contact manifolds exhibits very different characteristics from its counterpart on symplectic manifolds. These differences stem from the fact that contact manifolds carry Jacobi structures, which are more general than the Poisson structures associated with symplectic manifolds. In variational terms, one can show that the contact Hamiltonian equations can be derived from the so-called Herglotz principle, which includes the classical Hamilton principle as a particular case.

On the other hand, the Pontryagin maximum principle (PMP), see Pontryagin et al. (1962), Barbero-Liñan and Muñoz-Lecanda (2009) and references therein, is the most useful instrument for finding solutions to an optimal control problem. In fact, the PMP is the paradigm in the theory of optimal control and, since its formulation, research on its properties has never ceased, from very different points of view, although we will focus here on its more geometric aspects. A natural issue arising from possible applications is the study of optimal control problems from the point of view of Hamiltonian contact systems, and therefore of systems with dissipative properties, among many others. It is then very natural to ask whether a Pontryagin maximum principle can be developed to deal with a contact control problem. To our knowledge, the relationship between contact Hamiltonian systems and the Pontryagin maximum principle was first noticed in Ohsawa (2015) and developed in Jóźwikowski and Respondek (2016).

Looking at both topics from a common viewpoint, we consider whether the solution curves to the Pontryagin maximum principle admit a formulation in terms of Hamiltonian contact systems on an adequate manifold. Conversely, we examine whether the Herglotz variational problems can be understood as a particular class of optimal control problems.

With all this in mind, the paper is structured as follows. Sections 2 and 3 review the elements of Hamiltonian contact systems and of the Pontryagin maximum principle, both necessary to understand the object of the manuscript.

Section 2 recalls the main notions and results about contact Hamiltonian systems, including the so-called Herglotz principle, a natural extension of the well-known Hamilton principle.

Section 3 is dedicated to the Pontryagin maximum principle in several formulations. We introduce the classical optimal control problem, the associated extended system, the classical Pontryagin maximum principle and its transformation into the symplectic and presymplectic formulations. The presymplectic formulation is used in several sections of the article.

In Sect. 4 we discuss an interesting particular case of Hamiltonian dynamics. Given a vector field X on a manifold M, one can define the complete lift of X to the cotangent bundle \(T^*M\), which is just the Hamiltonian vector field corresponding to the Hamiltonian function determined by X, namely the evaluation of the points of the cotangent bundle on X. Hence the dynamics of a general vector field is described as that of a Hamiltonian vector field on a symplectic manifold. But this dynamics on \(T^*M\) is richer than one could expect. In fact, if the manifold M decomposes as \(M = \mathbb {R}\times M_o\), and the vector field has a particular symmetry property, one has a very natural setting to identify two different geometric behaviours according to the value of the momentum \(p_o\) corresponding to the global coordinate \(x^o\). Indeed, the first one is a (pre)symplectic geometry, when \(p_o=0\), and the second one is a contact geometry, when \(p_o \not = 0\).

Sections 5, 6 and 7 are the bulk of the paper. Section 5 is, in a broad sense, a direct application of Sect. 3. We consider an optimal control system given by \((M, U, X, I, x_a, x_b)\) where \(M = \mathbb {R} \times M_o\), that is, we study the so-called extended system associated to an optimal control problem defined by a vector field depending on controls, X(x, u), and a cost function F. Applying Theorem 5 in Sect. 3, we know that this problem is equivalent to solving the dynamics of the presymplectic system \((T^* M \times U, \omega , H)\), where \(X = F \frac{\partial }{\partial x^o} + X^i \frac{\partial }{\partial x^i}\), \(H = F p_o + X^i p_i\) is the linear Hamiltonian given by X, and \(\omega \) is the presymplectic form obtained by lifting the canonical symplectic form, \(\omega _M\in \Omega ^2(T^*M)\), to \(T^*M \times U\). Here U is the space of controls. The corresponding presymplectic algorithm provides the solutions, and we can distinguish two cases: the regular one, when the controls can be obtained as functions of the rest of the variables, and the singular one, which produces higher order conditions. Again, the momentum \(p_o\) is constant along the evolution, and this allows us, as above, to discuss the cases \(p_o = 0\) and \(p_o \not = 0\) separately. With this in mind, we are able to state the Contact Pontryagin maximum principle (Theorem 4).

Section 6 interprets the Herglotz principle as an optimal control problem, and derives the Herglotz equations of motion using the corresponding Pontryagin principle. In Sect. 7 we state the Herglotz optimal control problem and find the solution equations. In this situation, the extremal condition, given as an integral of the cost function in the classical optimal control problems, is replaced by an extremal condition on the solutions of a differential equation for a new variable to be maximized. This problem is a generalization of the classical optimal control systems in the sense that we recover the classical equations when the cost function and the extremal condition are as in the classical situation. Finally, in Sect. 8 we apply the above results to an example coming from thermodynamics.

We are aware that practical applications of optimal control require more general classes of functions and mappings. Nevertheless, as is usual in this kind of theoretical approach, all the manifolds and mappings are assumed to be of \(\mathcal {C}^\infty \)-class. The Einstein summation convention is used unless otherwise indicated. As general references for notations and basic results on geometry, mechanics and control we use (Abraham and Marsden 1978; Bullo and Lewis 2005; Bloch 2015).

2 Precontact Hamiltonian Systems

In this section we review the necessary theory of contact manifolds and of contact and precontact dynamical systems, in both Hamiltonian and Lagrangian formulations, together with the Herglotz variational principle and its generalized Euler–Lagrange equations. See Arnold (1978), Bravetti (2017), Bravetti et al. (2017), de León and Lainz-Valcázar (2019), Gaset et al. (2020a), Geiges (2008), Guenther et al. (1996), Lainz-Valcázar and de León (2019) and Liu et al. (2018) for details.

2.1 Contact Manifolds and Hamiltonian Systems

A contact manifold \((M, \eta )\) is a \((2n+1)\)-dimensional manifold equipped with a contact form \(\eta \), that is, a 1-form satisfying \(\eta \wedge (\textrm{d}\eta )^n \not = 0\). Then, there exists a unique vector field \(\mathcal {R}\), called the Reeb vector field, such that

$$\begin{aligned} i_{{\mathcal {R}}} \, \textrm{d}\eta = 0 \; , \qquad i_{{\mathcal {R}}}\, \eta = 1. \end{aligned}$$
(1)

Given \((M,\eta )\), there is a Darboux theorem for contact manifolds: around each point in M one can find local Darboux coordinates \((q^i, p_i, z)\) such that

$$\begin{aligned} \eta = \textrm{d}z - p_i \, \textrm{d}q^i,\quad {\mathcal {R}} = \frac{\partial }{\partial z} . \end{aligned}$$
(2)

As an example, and a natural model, we have the extended cotangent bundle \(T^*Q \times \mathbb {R}\) of an n-dimensional manifold Q, which carries a natural contact form

$$\begin{aligned} \eta _Q = \textrm{d}z - \theta _Q, \end{aligned}$$
(3)

where \(\theta _Q\) is the pullback of the Liouville 1-form of \(T^* Q\), \( \theta _Q = p_i \textrm{d}q^i\), and \((q^i,p_i,z)\) are the natural bundle coordinates of \(T^*Q \times \mathbb {R}\).

If \((M,\eta )\) is a contact manifold, the map:

$$\begin{aligned} \bar{\flat } : TM&\rightarrow T^* M ,\\ v&\mapsto \iota _{v} \textrm{d}\eta + \eta (v) \eta . \end{aligned}$$

is a vector bundle isomorphism over M.

Given a Hamiltonian function \(H:M \rightarrow \mathbb {R}\), we can define a dynamical system. The triple \((M,\eta ,H)\) is called a contact Hamiltonian system. The associated Hamiltonian vector field \(X_H\) is the solution to the following equation

$$\begin{aligned} \bar{\flat } (X_H) = \textrm{d}H - ({\mathcal {R}} (H) + H) \, \eta . \end{aligned}$$
(4)

In Darboux coordinates, \(X_H\) has the local expression

$$\begin{aligned} X_H = \frac{\partial H}{\partial p_i} \frac{\partial }{\partial q^i} - \left( {\frac{\partial H}{\partial q^i} + p_i \frac{\partial H}{\partial z}}\right) \frac{\partial }{\partial p_i} + \left( {p_i \frac{\partial H}{\partial p_i} - H}\right) \frac{\partial }{\partial z} . \end{aligned}$$
(5)

Therefore, an integral curve \((q^i(t), p_i(t), z(t))\) of \(X_H\) satisfies the differential equations

$$\begin{aligned} \frac{\textrm{d}q^i}{\textrm{d}t}&= \frac{\partial H}{\partial p_i}, \\ \frac{\textrm{d}p_i}{\textrm{d}t}&= - \frac{\partial H}{\partial q^i} - p_i \frac{\partial H}{\partial z},\\ \frac{\textrm{d}z}{\textrm{d}t}&= p_i \frac{\partial H}{\partial p_i} - H. \end{aligned}$$
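As an illustration, consider the standard example of a particle with linear damping. The single degree of freedom, the mass \(m>0\), the potential \(V(q)\) and the damping constant \(\kappa \ge 0\) are assumptions introduced here only for the sake of the example. Take the contact Hamiltonian

$$\begin{aligned} H(q,p,z) = \frac{p^2}{2m} + V(q) + \kappa \, z . \end{aligned}$$

Substituting into the equations above gives

$$\begin{aligned} \frac{\textrm{d}q}{\textrm{d}t}&= \frac{p}{m}, \\ \frac{\textrm{d}p}{\textrm{d}t}&= - V'(q) - \kappa \, p,\\ \frac{\textrm{d}z}{\textrm{d}t}&= \frac{p^2}{2m} - V(q) - \kappa \, z, \end{aligned}$$

so that \(m \ddot{q} + \kappa m {\dot{q}} + V'(q) = 0\). The term \(-p\, \partial H/\partial z\) plays the role of the friction force, and for \(\kappa = 0\) the first two equations reduce to the usual symplectic Hamilton equations.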

2.2 Precontact Manifolds and Hamiltonian Systems

Let \(\eta \) be a 1-form on an m-dimensional manifold M. We define the characteristic distribution of \(\eta \) as

$$\begin{aligned} \mathcal {C}= \ker \eta \cap \ker \textrm{d}\eta \subseteq TM , \end{aligned}$$
(7)

which we suppose to be regular, that is, of constant rank. We say that \(\eta \) is a 1-form of class c if the rank of the distribution \(\mathcal {C}\) is \(m-c\). The following proposition gives several characterizations of this notion (Godbillon 1969).

Proposition 1

Let \(\eta \) be a one-form on an m-dimensional manifold M. Then, the following statements are equivalent:

  1. The form \(\eta \) is of class \(2r+1\).

  2. At every point of M,

    $$\begin{aligned} \eta \wedge {(\textrm{d}\eta )}^r \not = 0, \quad \eta \wedge {(\textrm{d}\eta )}^{r+1} = 0. \end{aligned}$$
    (8)

  3. Around any point of M, there exist local Darboux coordinates \(x^1,\ldots ,x^r\), \(y_1, \ldots ,y_r\), z, \(u^1, \ldots ,u^s\), where \(2r+s+1 = m\), such that

    $$\begin{aligned} \eta = \textrm{d}z - \sum _{i=1}^r y_i \textrm{d}x^i. \end{aligned}$$
    (9)

In these Darboux coordinates, the characteristic distribution of \(\eta \) is given by

$$\begin{aligned} \mathcal {C} = \left\langle \left\{ \frac{\partial }{\partial u^a} \right\} _{a=1,\ldots ,s}\right\rangle . \end{aligned}$$
(10)

A pair \((M,\eta )\) of a manifold M equipped with a form \(\eta \) as above will be called a precontact manifold (see Godbillon 1969). The form \(\eta \) will be called a precontact form.

Remark 1

The distribution \(\mathcal {C}\) is involutive and it gives rise to a foliation of M. If the quotient \(\pi :M \rightarrow M/\mathcal {C}\) has a manifold structure, then there is a unique 1-form \({\tilde{\eta }}\) such that \(\pi ^* \tilde{\eta }=\eta \). From a direct computation, \(\tilde{\eta }\) is a contact form on \(M/\mathcal {C}\). This justifies the name of precontact form.

Given \((M,\eta )\), the following map

$$\begin{aligned} \begin{aligned} {\flat }: TM&\rightarrow T^*M\\ v&\mapsto \iota _{v} \textrm{d}\eta + \eta (v) \eta , \end{aligned} \end{aligned}$$
(11)

is a morphism of vector bundles over M and its kernel is \(\mathcal {C}\).

A Reeb vector field for \((M,\eta )\) is a vector field \(\mathcal {R}\) on M such that

$$\begin{aligned} \iota _{\mathcal {R}} \textrm{d}\eta = 0, \qquad \eta (\mathcal {R}) = 1, \end{aligned}$$
(12)

or, equivalently \(\flat (\mathcal {R}) = \eta \).

We note that there exist Reeb vector fields on every precontact manifold. Indeed, we can define local vector fields \(\mathcal {R}= \frac{\partial }{\partial z} \) in Darboux coordinates and glue them into a global one using partitions of unity. However, unlike on contact manifolds, they are not unique. In fact, given a Reeb vector field \(\mathcal {R}\) and any section C of \(\mathcal {C}\), we have that \(\mathcal {R}' = \mathcal {R}+ C\) is another Reeb vector field.
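A minimal example may help to fix ideas. The manifold \(M=\mathbb {R}^4\), with global coordinates (x, y, z, u), and the form \(\eta = \textrm{d}z - y\, \textrm{d}x\) are chosen here only for illustration. Then

$$\begin{aligned} \textrm{d}\eta = \textrm{d}x \wedge \textrm{d}y, \qquad \eta \wedge \textrm{d}\eta = \textrm{d}z \wedge \textrm{d}x \wedge \textrm{d}y \not = 0, \qquad \eta \wedge (\textrm{d}\eta )^2 = 0, \end{aligned}$$

so \(\eta \) is of class 3 (that is, \(r=1\) and \(s=1\)) and \(\mathcal {C} = \left\langle \partial /\partial u \right\rangle \). For any function f on M, the vector field

$$\begin{aligned} \mathcal {R}_f = \frac{\partial }{\partial z} + f \frac{\partial }{\partial u} \end{aligned}$$

satisfies \(\iota _{\mathcal {R}_f} \textrm{d}\eta = 0\) and \(\eta (\mathcal {R}_f) = 1\), hence it is a Reeb vector field. Different choices of f give different Reeb vector fields, all of which project onto the unique Reeb vector field \(\partial /\partial z\) of the contact quotient \(\mathbb {R}^3\) with coordinates (x, y, z).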

2.2.1 Precontact Hamiltonian Systems and the Constraint Algorithm

A precontact Hamiltonian system is a precontact manifold \((M,\eta )\) with a smooth function \(H:M \rightarrow \mathbb {R}\) called the Hamiltonian. We denote it by \((M,\eta , H)\).

For a precontact Hamiltonian system \((M,\eta , H)\), given a submanifold \(M'\subset M\), a Hamiltonian vector field along \(M'\) is a vector field \(X \in {\mathfrak {X}}(M)\) such that \(X|_{M'}\in {\mathfrak {X}}(M')\) and X is a solution to the equation

$$\begin{aligned} \flat (X) = \textrm{d}H - (H + \mathcal {R}(H)) \eta , \end{aligned}$$
(13)

at the points of \(M'\), where \(\mathcal {R}\) is any Reeb vector field. It can be seen that, if this equation holds for one Reeb vector field, it holds for all of them.

Notice that, since \(\flat \) is not an isomorphism, equation (13) might not have solutions at every point of the manifold M. Furthermore, solutions, if they exist, are not necessarily unique. Indeed, adding a section C of \(\mathcal {C}\) to a solution X gives rise to a new solution \(X' = X + C\). In order to obtain the maximal submanifold along which Hamiltonian vector fields are defined, we can develop a constraint algorithm. To do so, let \(\gamma _H=\textrm{d}H - (H + \mathcal {R}(H)) \eta \in \Omega ^1(M)\) and define inductively \(M_0 = M\), and for any positive integer i,

$$\begin{aligned} M_{i} = \{p \in M_{i-1} \mid (\gamma _H)_p \in {\flat }(T_p M_{i-1}) \}, \end{aligned}$$
(14)

where we assume that all \(M_i\) are manifolds.

The algorithm will eventually stop, that is, we will find a positive integer i such that \(M_i = M_{i-1}\). We call this submanifold the final constraint submanifold \(M_f\). If \(M_f\) has positive dimension, there will exist Hamiltonian vector fields along \(M_f\). The pair \((M_f,X)\) will be called a Hamiltonian vector field solution to the Hamiltonian precontact system \((M,\eta , H)\).
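The first step of the algorithm can be made explicit in a simple case. The following sketch assumes \(M=\mathbb {R}^4\) with coordinates (x, y, z, u), \(\eta = \textrm{d}z - y\, \textrm{d}x\), the Reeb vector field \(\mathcal {R} = \partial /\partial z\) and an arbitrary Hamiltonian H(x, y, z, u); these choices are made here for illustration only, and subscripts on H denote partial derivatives. For \(X = a\, \partial _x + b\, \partial _y + c\, \partial _z + e\, \partial _u\) one finds

$$\begin{aligned} \flat (X)&= a \, \textrm{d}y - b \, \textrm{d}x + (c - y a)(\textrm{d}z - y \, \textrm{d}x), \\ \gamma _H&= \left( H_x + y (H + H_z) \right) \textrm{d}x + H_y \, \textrm{d}y - H \, \textrm{d}z + H_u \, \textrm{d}u . \end{aligned}$$

Since \(\flat (X)\) has no component along \(\textrm{d}u\), the condition \((\gamma _H)_p \in \flat (T_pM)\) holds exactly where \(H_u = 0\), so that

$$\begin{aligned} M_1 = \left\{ \frac{\partial H}{\partial u} = 0 \right\} , \qquad a = H_y, \quad b = - H_x - y H_z, \quad c = y H_y - H, \end{aligned}$$

with e undetermined at this stage. If \(\partial ^2 H/\partial u^2 \not = 0\) on \(M_1\), then \(T_pM_1\) is transverse to \(\mathcal {C}\), hence \(\flat (T_pM_1) = \flat (T_pM)\), the algorithm stops at \(M_f = M_1\), and the tangency condition \(X(H_u) = 0\) fixes e. Otherwise, further constraints may appear.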

A useful characterization of such pairs is given by the following

Proposition 2

X is a Hamiltonian vector field along \(M'\) for \((M,\eta ,H)\) if and only if, at the points of \(M'\),

$$\begin{aligned} \eta (X)&= -H, \end{aligned}$$
(15a)
$$\begin{aligned} \mathop {\textrm{L}}\nolimits _{X} \eta&= g \eta , \end{aligned}$$
(15b)

where \(g:M' \rightarrow \mathbb {R}\). Moreover, if this holds, then \(g = - \mathcal {R}(H)\) for any Reeb vector field \(\mathcal {R}\).

Proof

Let X be a Hamiltonian vector field along \(M'\). By the definition of \(\flat \), equation (13), at the points of \(M'\), becomes

$$\begin{aligned} \iota _{X} \textrm{d}\eta + \eta (X) \eta = \textrm{d}H - (H + \mathcal {R}(H)) \eta , \end{aligned}$$
(16)

and, by contraction with \(\mathcal {R}\), we obtain

$$\begin{aligned} \eta (X) = -H. \end{aligned}$$
(17)

Combining (16) and (17), we deduce

$$\begin{aligned} \iota _{X} \textrm{d}\eta + \textrm{d}\iota _{X} \eta = - \mathcal {R}(H) \eta , \end{aligned}$$
(18)

but the left-hand side of this equation equals \(\mathop {\textrm{L}}\nolimits _X \eta \) by Cartan’s formula, hence X fulfills (15) at the points of \(M'\).

Now assume that X satisfies (15) at the points of \(M'\). Contracting (15b) with a Reeb vector field \(\mathcal {R}\) and using (15a), we have

$$\begin{aligned} g = \iota _{\mathcal {R}} \mathop {\textrm{L}}\nolimits _X (\eta ) = \iota _{\mathcal {R}} (\iota _{X} \textrm{d}\eta + \textrm{d}(\eta (X)) ) = - \iota _{\mathcal {R}}( \textrm{d}H) = -\mathcal {R}(H). \end{aligned}$$
(19)

Combining this with (15), we can easily retrieve (16). \(\square \)
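As a consistency check of Proposition 2, one can verify (15a) and (15b) directly for the contact Hamiltonian vector field (5) in Darboux coordinates \((q^i, p_i, z)\), where \(\textrm{d}\eta = \textrm{d}q^i \wedge \textrm{d}p_i\) and \(\mathcal {R} = \partial /\partial z\). The computation below is a sketch added for the reader's convenience:

$$\begin{aligned} \eta (X_H)&= \left( p_i \frac{\partial H}{\partial p_i} - H \right) - p_i \frac{\partial H}{\partial p_i} = -H, \\ \iota _{X_H} \textrm{d}\eta&= \frac{\partial H}{\partial p_i} \textrm{d}p_i + \left( \frac{\partial H}{\partial q^i} + p_i \frac{\partial H}{\partial z} \right) \textrm{d}q^i, \\ \mathop {\textrm{L}}\nolimits _{X_H} \eta&= \iota _{X_H} \textrm{d}\eta - \textrm{d}H = - \frac{\partial H}{\partial z} \left( \textrm{d}z - p_i \textrm{d}q^i \right) = - \mathcal {R}(H) \, \eta . \end{aligned}$$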

2.2.2 Morphisms of Precontact Hamiltonian Systems

Let \((M,\eta , H)\) and \((\bar{M},\bar{\eta },\bar{H})\) be precontact Hamiltonian systems. A map \(F:M \rightarrow \bar{M}\) is said to be a conformal morphism of precontact systems if \(F^* \bar{\eta } = f \eta \) and \(F^* \bar{H} = f H\) for some non-vanishing function \(f:M\rightarrow \mathbb {R}\). If \(f=1\), we say that F is a strict morphism of precontact systems.

Theorem 1

Let \(F:M \rightarrow \bar{M}\) be a conformal morphism of precontact systems. Assume that \(X, \bar{X}\) are F-related vector fields defined along submanifolds \(M' \subseteq M\) and \(\bar{M}' = F(M') \subseteq \bar{M}\), respectively. Then, if \(\bar{X}\) is a Hamiltonian vector field along \({\bar{M}}'\), X is also a Hamiltonian vector field along \(M'\).

Proof

Since \(\bar{X}\) is a Hamiltonian vector field, it satisfies (15) along \(\bar{M}'\):

$$\begin{aligned} \bar{\eta }(\bar{X})&= -\bar{H}, \end{aligned}$$
(20a)
$$\begin{aligned} \mathop {\textrm{L}}\nolimits _{\bar{X}} \bar{\eta }&= \bar{g} \bar{\eta }. \end{aligned}$$
(20b)

Pulling back by F, we obtain

$$\begin{aligned} f \eta (X)&= -f H , \end{aligned}$$
(21a)
$$\begin{aligned} \mathop {\textrm{L}}\nolimits _{{X}} (f {\eta })&= (\bar{g}\circ F) f {\eta }. \end{aligned}$$
(21b)

From this expression, we obtain

$$\begin{aligned} \eta (X)&= - H , \end{aligned}$$
(22a)
$$\begin{aligned} \mathop {\textrm{L}}\nolimits _{X} ( {\eta })&= g {\eta }, \end{aligned}$$
(22b)

where \(g = \bar{g}\circ F - (\mathop {\textrm{L}}\nolimits _{X} f)/f\). Hence X is a Hamiltonian vector field. \(\square \)
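The passage from (21b) to (22b) uses only the Leibniz rule for the Lie derivative. Spelled out, it reads

$$\begin{aligned} \mathop {\textrm{L}}\nolimits _{X} (f \eta ) = (\mathop {\textrm{L}}\nolimits _{X} f) \, \eta + f \mathop {\textrm{L}}\nolimits _{X} \eta = (\bar{g}\circ F) f \eta \quad \Longrightarrow \quad \mathop {\textrm{L}}\nolimits _{X} \eta = \left( \bar{g}\circ F - \frac{\mathop {\textrm{L}}\nolimits _{X} f}{f} \right) \eta , \end{aligned}$$

which is well defined because f does not vanish. For a strict morphism (\(f=1\)) one simply gets \(g = \bar{g}\circ F\).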

Observe that if F is a diffeomorphism, then we have a bijective correspondence between pairs of Hamiltonian vector fields along submanifolds.

2.3 The Lagrangian Formalism

Unlike \(T^*Q \times \mathbb {R}\), the manifold \(TQ \times \mathbb {R}\) does not have a canonical contact structure. However, given a Lagrangian function \(L:TQ \times \mathbb {R}\rightarrow \mathbb {R}\) one can construct the 1-form

$$\begin{aligned} \eta _{\textrm{L}} = \textrm{d}z - \theta _{\textrm{L}}, \end{aligned}$$
(23)

where \(\theta _{\textrm{L}}\) is the associated Lagrangian 1-form, which in bundle coordinates \((q^i, v^i, z)\) is written as

$$\begin{aligned} \theta _{\textrm{L}} = \frac{\partial L}{\partial v^i} \textrm{d}q^i. \end{aligned}$$
(24)

The Lagrangian L is said to be regular if its Hessian matrix with respect to the velocities,

$$\begin{aligned} (W_{ij}) = \left( \frac{\partial ^2 L}{\partial v^i \partial v^j} \right) , \end{aligned}$$
(25)

is regular.

One can see that \(\eta _{\textrm{L}}\) is a contact form when L is regular. Furthermore, \(\eta _{\textrm{L}}\) is a precontact form when \((W_{ij})\) has constant rank (see de León and Lainz-Valcázar 2019, Section).

The energy of the Lagrangian is \(E_{\textrm{L}} = \Delta (L) - L\) where \(\Delta \) is the canonical Liouville vector field on TQ, \( \Delta = v^i \frac{\partial }{\partial v^i}\), extended in the usual way to \(TQ \times \mathbb {R}\) with the same local expression.

Hence, provided L is such that \((W_{ij})\) has full (resp. constant) rank we have that \((TQ \times \mathbb {R}, \eta _{\textrm{L}}, E_{\textrm{L}})\) is a contact (resp. precontact) Hamiltonian system. Let \(\xi _{\textrm{L}}\) be a Hamiltonian vector field for this contact or precontact system. From a direct computation one can see that every integral curve \((q^i(t), v^i(t), z(t))\) of \(\xi _{\textrm{L}}\) is a solution of the Herglotz equations:

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t} \left( \frac{\partial L}{\partial v^i}\right) - \frac{\partial L}{\partial q^i} = \frac{\partial L}{\partial v^i} \frac{\partial L}{\partial z}, \end{aligned}$$
(26)

and \({\dot{z}}(t) = L(q^i(t), v^i(t), z(t))\). These equations are also called generalized Euler–Lagrange equations.

Notice that, in the contact case, \(\xi _{\textrm{L}}\) is a second order differential equation, a SODE, meaning that its integral curves satisfy \(v^i(t) = {\dot{q}}^i(t)\). In the precontact case, the situation is more subtle. If there exist solutions, which are not necessarily unique, there is at least one which is a SODE. The details are explained in de León and Lainz-Valcázar (2019, Section 10).
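For instance, consider a mechanical Lagrangian depending linearly on z. This is a standard illustrative choice; the mass \(m>0\), the potential V and the constant \(\kappa \) are assumptions of the example, not part of the general theory above:

$$\begin{aligned} L(q,v,z) = \frac{1}{2} m v^2 - V(q) - \kappa \, z . \end{aligned}$$

Here \((W_{ij}) = (m)\) is regular, so \(\eta _{\textrm{L}}\) is a contact form, and the Herglotz equations (26) together with the equation for z read

$$\begin{aligned} m {\dot{v}} + V'(q) = - \kappa \, m v, \qquad {\dot{z}} = \frac{1}{2} m v^2 - V(q) - \kappa \, z , \end{aligned}$$

that is, with \(v = {\dot{q}}\), the damped equation \(m \ddot{q} + \kappa m {\dot{q}} + V'(q) = 0\).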

2.3.1 The Herglotz Variational Principle

The integral curves of a contact Lagrangian system can also be obtained from a variational principle. Unlike in the case of Hamilton’s principle, the action is not an integral of the Lagrangian, but it is given by an ordinary differential equation on a new variable z.

Given a Lagrangian function \(L:TQ \times \mathbb {R}\rightarrow \mathbb {R}\) and points \(q_0,q_1\in Q\), we consider the set \(\Omega (q_0,q_1)\) of curves \(\gamma :[a,b] \rightarrow Q\) such that \(\gamma (a) = q_0\), \(\gamma (b) = q_1\), and we fix \(z_0 \in \mathbb {R}\). We define the functional

$$\begin{aligned} \mathcal {Z}:\Omega (q_0,q_1) \rightarrow \mathcal {C}^\infty ([a,b] \rightarrow \mathbb {R}), \end{aligned}$$
(27)

which assigns to each curve \(\gamma \) the curve \(\mathcal {Z}(\gamma )\) that solves the following ODE:

$$\begin{aligned} \begin{aligned} \frac{\textrm{d}\mathcal {Z}(\gamma )}{\textrm{d}t}&= L(\gamma , {\dot{\gamma }}, \mathcal {Z}(\gamma )),\\ \mathcal {Z}(\gamma )(a)&= z_0. \end{aligned} \end{aligned}$$
(28)

Finally, the action is given by evaluating the solution at the endpoints:

$$\begin{aligned} \begin{aligned} \mathcal {A}:\Omega (q_0,q_1) \rightarrow \mathbb {R}, \gamma&\mapsto \mathcal {Z}(\gamma )(b) \end{aligned} \end{aligned}$$
(29)
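To see how this action generalizes the usual one, consider, as an illustrative assumption, a Lagrangian of the form \(L(q,v,z) = \ell (q,v) - \kappa \, z\) with \(\kappa \) a constant. Then (28) is a linear ODE for \(\mathcal {Z}(\gamma )\) and can be solved explicitly:

$$\begin{aligned} \mathcal {A}(\gamma ) = \mathcal {Z}(\gamma )(b) = e^{-\kappa (b-a)} z_0 + \int _a^b e^{-\kappa (b-t)} \, \ell (\gamma (t), {\dot{\gamma }}(t)) \, \textrm{d}t . \end{aligned}$$

For \(\kappa = 0\) this is \(z_0 + \int _a^b \ell \, \textrm{d}t\), that is, Hamilton's action up to the additive constant \(z_0\). For \(\kappa > 0\), the contributions of earlier times are exponentially discounted.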

Using techniques from the calculus of variations (de León and Lainz-Valcázar 2019, Section 5), one can prove the following:

Theorem 2

(Contact variational principle) Let \(L: TQ \times \mathbb {R}\rightarrow \mathbb {R}\) be a Lagrangian function and let \(\gamma \in \Omega (q_0,q_1)\). Then, \((\gamma ,{\dot{\gamma }}, \mathcal {Z}(\gamma ))\) satisfies the Herglotz equations (26) if and only if \(\gamma \) is a critical point of \(\mathcal {A}\).

These Herglotz equations, also called generalized Euler–Lagrange equations, are

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\left( \frac{\partial L}{\partial v^i}\right) _\gamma -\frac{\partial L}{\partial q^i} -\frac{\partial L}{\partial z}\frac{\partial L}{\partial v^i}=0 . \end{aligned}$$

Observe that they are not linear in the Lagrangian.

In Sect. 6 we provide a new proof of this last statement based on the Pontryagin maximum principle.

3 A Quick Survey on Optimal Control and Pontryagin Maximum Principle

Roughly speaking, for our purposes the Pontryagin maximum principle (PMP) transforms an optimal control problem into a presymplectic one. The method mimics the lifting of a vector field \(X\in \mathfrak {X}(M)\) to the cotangent bundle \(T^*M\), using the Hamiltonian function obtained, by duality, from the natural action of the vector field X on the cotangent bundle. This is done for a control-dependent vector field, in the particular case where the original manifold M is the product \(M=\mathbb {R}\times M_o\) where \(M_o\) is a manifold.

This section introduces the optimal control problem and explains how the Pontryagin maximum principle works in the different situations that we are interested in. For clarity of exposition, we suppose that all the manifolds and mappings are of \(\mathcal {C}^{\infty }\)-class.

Since the original result and proof by Pontryagin and collaborators (Pontryagin et al. 1962), numerous expositions of the Pontryagin principle, with applications and proofs, have appeared; in this review we follow Barbero-Liñan and Muñoz-Lecanda (2009) for notations and statements. A detailed proof and an extensive bibliography can be found there.

3.1 The Optimal Control Problem

3.1.1 Statement of the Problem

Consider the diagram:

with the following elements:

  1. \(M_o\) is a differentiable manifold, \(\dim M_o=m_o\). It is the state space for the vector field \(X_o\). The points in \(M_o\) will be denoted by x and, when necessary, the coordinates in \(M_o\) will be denoted by \((x^i)\).

  2. \(U\subset \mathbb {R}^k\) is called the control set. Its elements, the controls, are denoted by u, and we denote by \((u^a)\) its coordinates, that is \(u=(u^1,\ldots ,u^k)\).

  3. \(X_o\) is a vector field along the projection \(M_o\times U\rightarrow M_o\). Given \(u\in U\) we denote \(X_o^u=X_o(\, .\,,u)\in \mathfrak {X}(M_o)\). It gives the dynamics of the problem.

Suppose that we are given a function \(F:M_o\times U\rightarrow \mathbb {R}\), an interval \(I=[a,b]\subset \mathbb {R}\) and points \(x_a,x_b\in M_o\). With all these elements \((M_o,U,X_o,F,I,x_a,x_b)\) we have the following

Optimal control problem, OCP: Find curves \(\gamma :I\rightarrow M_o\times U\), \(\gamma =(\gamma _o,\gamma _U)\), such that

  1. (1)

    end points conditions: \(\gamma _o(a)=x_a, \gamma _o(b)=x_b\),

  2. (2)

    \(\gamma \) is an integral curve of \(X_o\): \(\dot{\gamma }_o=X_o\circ \gamma \), and

  3. (3)

    minimal condition: \(S[\gamma ]=\int _a^b F(\gamma (t))\textrm{d}\,t\) is minimal over all curves satisfying (1) and (2).

The function F is called the cost function of the problem.

In local coordinates, if \(X_o =X^i\frac{\partial }{\partial x^i}\), then the differential equations for the curve \(\gamma \) are

$$\begin{aligned} {\dot{x}}^i =X^i(x^j,u) . \end{aligned}$$

The minimal condition allows us to obtain the controls \(u=u(t)\). Substituting them into the differential equations and integrating, we obtain the curves that solve the optimal control problem.

3.1.2 The Extended Optimal Control Problem

To solve the above problem it is necessary to incorporate the cost function into the vector field, as a new direction in the tangent bundle of the state space. This is done by constructing the so-called extended problem.

Associated with the previous elements, consider the diagram:

where the points in \(M=\mathbb {R}\times M_o\) are denoted by \((x^o,x)\), and the vector field X along the projection \(\pi _1\) is

$$\begin{aligned} X=F\frac{\partial }{\partial x^o}+X_o . \end{aligned}$$

Remark 2

Observe that \([\partial /\partial x^o,X]=0\), hence we are in a situation where the direction associated to \(x^o\) is specifically identified. In particular this implies that the vector field X is projectable to \(M_o\). This situation is going to be used in other parts of this and other sections.

From the original elements we have at the beginning, \((M_o,U,X_o,F,I,x_a,x_b)\), we now have \((M,U,X,I,x_a,x_b)\) and we consider the following problem:

Extended optimal control problem, EOCP: Find curves \({\hat{\gamma }}:I\rightarrow \mathbb {R}\times M_o\times U\), \({\hat{\gamma }}=(\gamma ^o, \gamma _o,\gamma _U)\), such that

  1. (1)

    end points conditions: \(\gamma _o(a)=x_a, \gamma _o(b)=x_b, \gamma ^o(a)=0\),

  2. (2)

    \({\hat{\gamma }}\) is an integral curve of X: \(\dot{\overline{(\gamma ^o,\gamma _o)}}=X\circ {\hat{\gamma }}\), and

  3. (3)

    maximal condition: \(x^o(b)\) is maximal over all curves satisfying (1) and (2).

Remember that F is the cost function of the original optimal control problem.

This extended optimal control problem is equivalent to the initial optimal control problem as defined above, that is, there is a bijection between the set of solutions \(\gamma \) of the first problem and the set of solutions \({\hat{\gamma }}\) of the second one, given by the components corresponding to the variables \(x^1,\ldots ,x^{m_o}\). The variable \(x^o\) is not relevant to the problem: it is an additional variable used to identify the direction with maximal increment in the tangent bundle to M and to prove the Pontryagin maximum principle.
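The correspondence can be made explicit. By conditions (1) and (2) of the extended problem, the additional component \(\gamma ^o\) is determined by the rest of the curve:

$$\begin{aligned} {\dot{\gamma }}^o(t) = F(\gamma _o(t), \gamma _U(t)), \quad \gamma ^o(a) = 0 \quad \Longrightarrow \quad \gamma ^o(t) = \int _a^t F(\gamma (s)) \, \textrm{d}s, \qquad \gamma ^o(b) = S[\gamma ] . \end{aligned}$$

Hence the extremal condition on the final value \(\gamma ^o(b)\) is a condition on the functional S of the original problem.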

In the sequel we only consider this form of the optimal control problem and we always refer to this statement as optimal control problem. We denote it by \((M,U,X,I,x_a,x_b)\).

3.2 The Pontryagin Maximum Principle

As we have said above, the solution to this problem was obtained by Pontryagin and collaborators in 1954. For a modern proof and applications, see Barbero-Liñan and Muñoz-Lecanda (2009) and references therein.

Given the above optimal control problem \((M,U,X,I,x_a,x_b)\), for any \(u\in U\), we consider the symplectic problem given by

  1. Manifold: \(T^*M\).

  2. Symplectic form: \(\omega _M\), the canonical 2-form of \(T^*M\).

  3. Hamiltonian function: \(H^u={\hat{X}}^u=p_oF^u+p_i(X^u)^i\).

Here we have denoted by \(X^u\) the vector field \(X(\, .\, , u)\), and similarly for the other elements. The Hamiltonian function is the natural one associated to the vector field \(X^u\) on the cotangent bundle \(T^*M\). We call this problem \((T^*M,\omega _M,H^u)\). It is a Hamiltonian symplectic system.

As we know, the associated Hamiltonian vector field, \(X_{\textrm{H}}^u\), defined by \(\textbf{i}(X_{\textrm{H}}^u)\omega _M=\textrm{d}\,H^u\), is locally given by

$$\begin{aligned} X_{\textrm{H}}^u=F^u\frac{\partial }{\partial x^o}+(X^u)^i\frac{\partial }{\partial x^i}- \left( p_o\frac{\partial F^u}{\partial x^i}+ p_j\frac{\partial (X^u)^j}{\partial x^i}\right) \frac{\partial }{\partial p_i} . \end{aligned}$$
(30)

This is nothing but the canonical lifting of a vector field X on a manifold M to its cotangent bundle \(T^*M\), usually denoted by \(X^*\); in this particular case, \((X^u)^*\). We return to these ideas in the following section, in more detail and from other points of view.

With this in mind we have the following result (see Barbero-Liñan and Muñoz-Lecanda (2009) for a detailed proof):
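As a simple check, not part of the original argument, take a control-independent vector field \(X = X^\mu \frac{\partial }{\partial x^\mu }\) on M, with natural coordinates \((x^\mu , p_\mu )\) on \(T^*M\); the index \(\mu \) is notation introduced only for this example. Then \({\hat{X}} = p_\mu X^\mu \) and its Hamiltonian vector field is

$$\begin{aligned} X^* = X^\mu \frac{\partial }{\partial x^\mu } - p_\nu \frac{\partial X^\nu }{\partial x^\mu } \frac{\partial }{\partial p_\mu } . \end{aligned}$$

For example, for \(X = x \frac{\partial }{\partial x}\) on \(M = \mathbb {R}\) one has \({\hat{X}} = p x\) and

$$\begin{aligned} X^* = x \frac{\partial }{\partial x} - p \frac{\partial }{\partial p}, \qquad (x, p) \mapsto (e^{t} x, e^{-t} p), \end{aligned}$$

where the flow of \(X^*\) is the cotangent lift of the flow \(x \mapsto e^{t} x\) of X.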

Theorem 3

(Pontryagin maximum principle) Given the optimal control problem \((M,U,X,I,x_a,x_b)\), let \({\hat{\gamma }}:I\rightarrow \mathbb {R}\times M_o\times U\) be an optimal solution, \({\hat{\gamma }}=(\gamma _{M},\gamma _U)\), then there exists \({\hat{\sigma }}:I\rightarrow T^*M\times U=T^*\mathbb {R}\times T^*M_o\times U\), \({\hat{\sigma }}=(\sigma _{T^*M},\sigma _U)\) such that

  1. (1)

    it is a solution to the Hamiltonian problem \((T^*M\times U,\omega ,H^u)\), that is, an integral curve of \(X_{\textrm{H}}^u\), for some fixed \(u\in U\),

  2. (2)

    \({\hat{\gamma }}=\pi \circ {\hat{\sigma }}\), where \(\pi :T^*M\times U\rightarrow M\times U\) is the natural projection, and \({\hat{\gamma }}\) satisfies the end points condition; hence \(\sigma _U=\gamma _U\),

  3. (3)

    \(H(\sigma _{T^*M}(t), \gamma _U(t))=\textrm{sup}_{v(t)\in U}H(\sigma _{T^*M}(t), v(t))\) for every \(t\in I\).

This Theorem gives a necessary condition that the solutions must fulfill. It is applied as follows: condition (3) allows us to obtain u(t), and with this solution we can integrate the Hamiltonian vector field \(X_{\textrm{H}}^u\), obtaining the curves \({\hat{\sigma }}(t)\) and hence \({\hat{\gamma }}(t)\) and the initially desired solution \(\gamma _o(t)\).

The differential equations defining the integral curves of \(X_{\textrm{H}}^u\) are the following:

$$\begin{aligned} \begin{aligned} {\dot{x}}^o=\frac{\partial H^u}{\partial p_o}=F, \quad&{\dot{p}}_o= -\frac{\partial H^u}{\partial x^o}=0,\,\,(\Rightarrow p_o=\textrm{const}) \\ {\dot{x}}^i=\frac{\partial H^u}{\partial p_i}=X^i, \quad&{\dot{p}}_i=-\frac{\partial H^u}{\partial x^i}=-p_o\frac{\partial F}{\partial x^i}-p_j\frac{\partial X^j}{\partial x^i} \end{aligned} \end{aligned}$$
(31)

Since we assume that all the elements of the problem are of class \(\mathcal {C}^{\infty }\), if we suppose furthermore that \(U\subset \mathbb {R}^k\) is an open set, then condition (3) in the Theorem can be replaced by

(3\('\)) \(\frac{\partial H}{\partial u}|_{{\hat{\sigma }}(t)}=0\) for every \(u\in U\).

Hence, in order to obtain the solution \(\gamma _U\), when possible, we add this last expression to Eq. (31). If \((u^1,\ldots ,u^k)\) are the natural coordinates on \(\mathbb {R}^k\), we have the equations

$$\begin{aligned} \frac{\partial H}{\partial u^1}=0,\ldots ,\frac{\partial H}{\partial u^k}=0 \end{aligned}$$
(32)

together with Eq. (31) to solve the optimal control problem.

In the sequel we will assume that U is an open subset of \(\mathbb {R}^k\).

Then instead of Theorem 3, we have the following

Theorem 4

(Weak Pontryagin maximum principle) Given the optimal control problem \((M,U,X,I,x_a,x_b)\), with \(U\subset \mathbb {R}^k\) an open set, let \({\hat{\gamma }}:I\rightarrow \mathbb {R}\times M_o\times U\) be a solution, \({\hat{\gamma }}=(\gamma _{M},\gamma _U)\), then there exists \({\hat{\sigma }}:I\rightarrow T^*M\times U=T^*\mathbb {R}\times T^*M_o\times U\), \({\hat{\sigma }}=(\sigma _{T^*M},\sigma _U)\) such that

  1. (1)

    it is a solution to the Hamiltonian problem \((T^*M\times U,\omega ,H^u)\), that is, it is an integral curve of \(X_{\textrm{H}}^u\), for any fixed \(u\in U\),

  2. (2)

    \({\hat{\gamma }}=\pi \circ {\hat{\sigma }}\), where \(\pi :T^*M\times U\rightarrow M\times U\) is the natural projection, and \({\hat{\gamma }}\) satisfies the end points condition; hence \(\sigma _U=\gamma _U\),

  3. (3)

    minimality conditions: \(\frac{\partial H}{\partial u}|_{{\hat{\sigma }}(t)}=0\) for every \(u\in U\) and for every \(t\in I\).

3.3 The Presymplectic Approach to PMP

We now give another approach to the Pontryagin maximum principle, better suited to our problems. It is stated as a presymplectic problem and goes as follows.

Consider the problem given by \((M,U,X,I,x_a,x_b)\) and the solution by means of the symplectic system \((T^*M,\omega _M,H^u)\) with Eqs. (31) and (32). Take the projection

$$\begin{aligned} \pi _1:T^*M\times U\rightarrow T^*M \end{aligned}$$

and the 2-form \(\omega =\pi _1^*\,\omega _M\in \Omega ^2(T^*M\times U)\). It is a presymplectic form and its kernel is spanned by

$$\begin{aligned} \ker \omega =\left\{ \frac{\partial }{\partial u^1},\ldots ,\frac{\partial }{\partial u^k}\right\} . \end{aligned}$$

We can consider the presymplectic system \((T^*M\times U,\omega , H)\) whose dynamical equation is given by

$$\begin{aligned} \textbf{i}(X_{\textrm{H}})\,\omega =\textrm{d}\, H . \end{aligned}$$

Being a presymplectic system, the compatibility equations are given by \(\textbf{i}(Z)\textrm{d}\,H=0\) for every \(Z\in \ker \omega \), that is Eq. (32).
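
As a short check, written for the Hamiltonian \(H=p_oF+p_iX^i\) of this section (with F and \(X^i\) allowed to depend on the controls), contracting \(\textrm{d}\,H\) with each generator of \(\ker \omega \) reproduces Eq. (32) explicitly:

$$\begin{aligned} \textbf{i}\left( \frac{\partial }{\partial u^a}\right) \textrm{d}\,H=\frac{\partial H}{\partial u^a}=p_o\frac{\partial F}{\partial u^a}+p_i\frac{\partial X^i}{\partial u^a}=0,\qquad a=1,\ldots ,k . \end{aligned}$$

These are k conditions on \(T^*M\times U\); they single out the points where the presymplectic equation can have a solution tangent to the constraint set.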

Changing Theorem 4 to this new situation we have

Theorem 5

(Presymplectic Pontryagin maximum principle) Given the optimal control problem \((M,U,X,I,x_a,x_b)\), with \(U\subset \mathbb {R}^k\) an open set, let \({\hat{\gamma }}:I\rightarrow M\times U=\mathbb {R}\times M_o\times U\) be a solution, \({\hat{\gamma }}=(\gamma _M,\gamma _U)\), then there exists \({\hat{\sigma }}:I\rightarrow T^*M\times U=T^*\mathbb {R}\times T^*M_o\times U\), \({\hat{\sigma }}=(\sigma _{T^*M},\sigma _U)\) such that

  1. (1)

    it is a solution to the Hamiltonian presymplectic problem \((T^*M\times U,\omega ,H)\), that is, it is an integral curve of \(X_{\textrm{H}}\), solution to the equation \(\textbf{i}(X_{\textrm{H}})\,\omega =\textrm{d}\, H\),

  2. (2)

    \({\hat{\gamma }}=\pi \circ {\hat{\sigma }}\), where \(\pi :T^*M\times U\rightarrow M\times U\) is the natural projection, and \({\hat{\gamma }}\) satisfies the end points condition; hence \(\sigma _U=\gamma _U\),

  3. (3)

    minimality (compatibility) conditions: \(\frac{\partial H}{\partial u}|_{{\hat{\sigma }}(t)}=0\) for every \(u\in U\) and for every \(t\in I\).

A solution to the equation \(\textbf{i}(X_{\textrm{H}})\,\omega =\textrm{d}\, H\) is given by:

$$\begin{aligned} X_{\textrm{H}}=F\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}- \left( p_o\frac{\partial F}{\partial x^i}+ p_j\frac{\partial X^j}{\partial x^i}\right) \frac{\partial }{\partial p_i} . \end{aligned}$$
(33)

Observe that this solution exists all over the manifold \(T^*M\times U\) and that \(p_o\) is constant for every curve solution to the problem.

Suppose that the compatibility equations allow us to determine the controls \(u^1,\ldots ,u^k\), that is, we can obtain \(u^a= \psi ^a(x^o,x^i,p_o,p_i)\); then we say that the optimal control problem is regular, otherwise it is called singular. In the singular case it is necessary to apply a constraint algorithm, that is, to go to higher-order conditions, to obtain the controls, perhaps only on a submanifold of \(T^*M\times U\). See Barbero-Liñan and Muñoz-Lecanda (2009, 2012) for details on these ideas and Gotay and Nester (1979) for the algorithm used.
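
To illustrate the regular case, consider a hypothetical control-affine system with quadratic cost, which is not taken from the references above: \(X^i=f^i(x)+g^i_a(x)\,u^a\) and \(F=-\frac{1}{2}\delta _{ab}u^au^b\), where the functions \(f^i\) and \(g^i_a\) are assumptions made only for this sketch. With \(H=p_oF+p_iX^i\), the compatibility equations read

$$\begin{aligned} \frac{\partial H}{\partial u^a}=-p_o\,\delta _{ab}u^b+p_i\,g^i_a=0 \quad \Longrightarrow \quad u^a=\frac{1}{p_o}\,\delta ^{ab}\,p_i\,g^i_b \qquad (p_o\ne 0) . \end{aligned}$$

The controls are therefore determined wherever \(p_o\ne 0\). At \(p_o=0\) the Hessian \(\partial ^2H/\partial u^a\partial u^b=-p_o\,\delta _{ab}\) degenerates, the equations reduce to the conditions \(p_i\,g^i_a=0\) on the states and momenta, and the controls are no longer fixed by them.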

Note that the weak and the presymplectic approaches to the maximum principle are equivalent since the local equations are the same.

Remark 3

Throughout this section, and for simplicity of exposition, we have assumed that the set of controls U is an open set in a Euclidean space; hence we have the product \(M\times U\). We can replace this situation by a non-trivial bundle \(C\rightarrow M\), instead of the natural projection \(M\times U\rightarrow M\), considering the controls as the elements of the fibres. The local equations for the controls are the same as those obtained in the trivial case.

4 Dynamics of Vector Fields as Contact Dynamics

It is well known that the integral curves of a vector field on a manifold M can be obtained as projections of integral curves of a Hamiltonian vector field on the cotangent bundle. We can extend this dynamics to the associated contact manifold \(TM\times \mathbb {R}\), as in Eq. (3), which gives the additional equation \({\dot{z}}=0\); that is, the extension is trivial. We want to obtain a non-trivial extension.

In this section we study how to obtain these integral curves as solutions of a contact dynamical system on a suitable contact manifold, at least when the original vector field has some symmetry properties. We recover a situation similar to the one found in the symplectic approach to the Pontryagin maximum principle; see Sect. 3.

4.1 The General Case

Let M be a manifold and \(X\in \mathfrak {X}(M)\) a vector field. Let \({\hat{X}}:T^*M\rightarrow \mathbb {R}\) be the natural function defined by \({\hat{X}}(\alpha )=\alpha (X)=\langle \alpha ,X\rangle \). In a canonical coordinate system \((x^i,p_i)\) in \(T^*M\), we have that \({\hat{X}}(x,p)=p_iX^i\).

As is well known, if \(\omega _M=-\textrm{d}\,\theta _M\) is the canonical symplectic 2-form in \(T^*M\), we can consider the Hamiltonian symplectic system \((T^*M,\omega _M,{\hat{X}})\). Then the Hamiltonian vector field \(Y_{{\hat{X}}}\in \mathfrak {X}(T^*M)\), defined by \(\textbf{i}(Y_{{\hat{X}}})\omega _M=\textrm{d}\,\hat{X}\), has local expression

$$\begin{aligned} X=X^i\frac{\partial }{\partial x^i},\,\,\,\Rightarrow \,\,\, Y_{{\hat{X}}}=X^i\frac{\partial }{\partial x^i}-p_j\frac{\partial X^j}{\partial x^i}\frac{\partial }{\partial p_i} \end{aligned}$$

if \((x^i)\) and \((x^i,p_i)\) are coordinates of M and \(T^*M\) respectively. From this local expression we have that \(Y_{{\hat{X}}}=X^*\), where \(X^*\) is the so-called canonical lifting of \(X\in {\mathfrak {X}}(M)\) to \(T^*M\). The integral curves of \(Y_{{\hat{X}}}\) projected to M are the integral curves of X, as can be seen directly from the above local expression. With this method, any vector field is transformed into a Hamiltonian one, at the cost of doubling the dimension. For details about these constructions we refer to de León and Rodrigues (1989), Yano and Ishihara (1973).

Observe that the Hamiltonian \({\hat{X}}\) depends linearly on the momenta.
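
As a simple illustration (an example chosen here, not taken from the references), let \(M=\mathbb {R}^2\) with coordinates \((x^1,x^2)\) and let X be the generator of rotations. The construction above gives

$$\begin{aligned} X&=-x^2\frac{\partial }{\partial x^1}+x^1\frac{\partial }{\partial x^2},\qquad {\hat{X}}=x^1p_2-x^2p_1, \\ Y_{{\hat{X}}}&=-x^2\frac{\partial }{\partial x^1}+x^1\frac{\partial }{\partial x^2}-p_2\frac{\partial }{\partial p_1}+p_1\frac{\partial }{\partial p_2} . \end{aligned}$$

The momenta rotate together with the positions, and dropping the last two terms recovers X, in agreement with the projection property.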

4.2 The Case \(M=\mathbb {R}\times M_o\)

In this section we analyze the specific case where the manifold M has a distinguished “direction”, that is \(M=\mathbb {R}\times M_o\). This situation allows us to split the integral curves of a vector field with a symmetry property into two different classes: one governed by symplectic geometry and the other by contact geometry. This study is connected with the ideas developed in Ohsawa (2015), where the contact structure is associated with the projectivization of \(T^*M\).

4.2.1 The Symplectic Case

Suppose now that we have one direction specially identified in the tangent bundle to the manifold, that is \(M=\mathbb {R}\times M_o\). When necessary we denote by \((x^o,x^i)\) a coordinate system in M and \((x^o,x^i,p_o,p_i)\) its natural extension to \(T^*M\).

Let \(X\in \mathfrak {X}(M)\) and suppose that

$$\begin{aligned} \left[ \frac{\partial }{\partial x^o},X\right] =0. \end{aligned}$$

In coordinates this means that, if \(X=X^o\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}\), then the components \(X^o\) and \(X^i\) of the vector field X do not depend on \(x^o\). In particular this implies that X is projectable to \(M_o\).

Remark 4

What is the meaning of this situation? Suppose we have two vector fields \(X_o,X\in {\mathfrak {X}}(M)\) with \([X_o,X]=0\). Then around any regular point of \(X_o\) we can choose a local coordinate system \((U,x^o,x^i)\), with \(i=1,\ldots ,n\), if \(\dim M=1+n\), and \(U\subset M\) an open set, with \(X_o|_U=\partial /\partial x^o\). Hence we have the above situation but locally. In this case the local decomposition \(\{x^o\}\times \{x ^i\}\) is not unique.

This is what we called above “particular symmetry property” for the vector field X. We can observe that it is a common situation at least locally.

This is a situation we are going to tackle when trying to relate contact structures and optimal control. The variable \(x^o\) will correspond to the cost function F as we have seen in Sect. 3 in our review of the Pontryagin maximum principle.

If we proceed in this case as above in the general situation, with \({\textbf{i}}(X^*)\omega _M=\textrm{d}\, H\), where the Hamiltonian function H is defined by

$$\begin{aligned} H=\hat{X}= p_oX^o+p_iX^i \end{aligned}$$

then the corresponding Hamiltonian vector field, using \( [\partial /\partial x^o,X]=0 \), is given by

$$\begin{aligned} X^*=X^o\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}-0\frac{\partial }{\partial p_o}-\left( p_o\frac{\partial X^o}{\partial x^i}+ p_j\frac{\partial X^j}{\partial x^i}\right) \frac{\partial }{\partial p_i} . \end{aligned}$$

The associated system of differential equations is:

$$\begin{aligned} {\dot{x}}^o=X^o,\,\,{\dot{p}}_o=0,\,\, {\dot{x}}^i=X^i,\,\, {\dot{p}}_i=-p_o\frac{\partial X^o}{\partial x^i}-p_j\frac{\partial X^j}{\partial x^i} \end{aligned}$$

This is the description of the Hamiltonian system \((T^*M,\omega _M,H)\) with \(H=\hat{X}\).

4.2.2 The Relation with Contact Dynamics

Observe that the vector field \(X^*\) is tangent to the submanifolds defined by \(p_o=\textrm{constant}\), hence we can reduce the problem to these hypersurfaces of \(T^*M\). There are two different situations which, by comparison with optimal control and the symplectic Pontryagin maximum principle, we call the normal and abnormal situations.

(a) The normal situation \(p_o\ne 0\)

For \(\lambda _o\in \mathbb {R}\), \(\lambda _o\ne 0\), let \(N\subset T^*M\) be the submanifold defined by \(p_o=\lambda _o\) and let \(j:N\hookrightarrow T^*M\) be the natural inclusion. Since the dimension of N is odd, it cannot be a symplectic manifold. We denote by \((x^o,x^i,p_i)\) the coordinates induced in N by the coordinates we have in \(T^*M\).

Consider now the canonical 1-form \(\theta _M\in \Omega ^1(T^*M)\) and let \( \eta =-j^*\theta _M\), then we have the following result

Lemma 1

\((N,\eta )\) is a contact manifold. The Reeb vector field is \(R=-\frac{1}{\lambda _o}\frac{\partial }{\partial x^o}\).

The proof is direct using its local expression, \(\eta =-\lambda _o\textrm{d}\,x^o-p_i\textrm{d}\, x^i\). The minus sign comes from a convention in the definition of the symplectic form in \(T^*M\) and the 1-form and 2-form in a contact manifold.
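
The direct check can be sketched as follows, writing \(m_o=\dim M_o\), so that \(\dim N=2m_o+1\). Since \(\textrm{d}\,\eta =\textrm{d}\,x^i\wedge \textrm{d}\,p_i\), we have

$$\begin{aligned} \eta \wedge (\textrm{d}\,\eta )^{m_o}&=-\lambda _o\,m_o!\;\textrm{d}\,x^o\wedge \textrm{d}\,x^1\wedge \textrm{d}\,p_1\wedge \cdots \wedge \textrm{d}\,x^{m_o}\wedge \textrm{d}\,p_{m_o}\ne 0, \\ \textbf{i}(R)\,\eta&=-\lambda _o\left( -\frac{1}{\lambda _o}\right) =1,\qquad \textbf{i}(R)\,\textrm{d}\,\eta =0 . \end{aligned}$$

The term \(-p_i\textrm{d}\,x^i\) of \(\eta \) does not contribute to the first line because \((\textrm{d}\,\eta )^{m_o}\) already contains every \(\textrm{d}\,x^i\); the resulting top-degree form is nonzero precisely because \(\lambda _o\ne 0\).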

Let \(H_N=j^*H\) be the restriction of H to N. Locally, \(H_N= \lambda _oX^o+p_iX^i\), and we have a Hamiltonian contact system given by \((N,\eta ,H_N)\). Let \(X_N\in \mathfrak {X}(N)\) be the corresponding contact Hamiltonian vector field, that is:

$$\begin{aligned} {\textbf{i}}(X_N)\eta =-H_N,\,\,\,{\textbf{i}}(X_N)\textrm{d}\,\eta =\textrm{d}\, H_N-(L(R)H_N)\eta \, \end{aligned}$$

whose local expression is

$$\begin{aligned} X_N=X^o\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}- \left( \lambda _o\frac{\partial X^o}{\partial x^i}+ p_j\frac{\partial X^j}{\partial x^i}\right) \frac{\partial }{\partial p_i} , \end{aligned}$$
(34)

where, as usual, we use the same notation for the functions on \(T^*M\) and for their restrictions to N.

With this in mind, we have that:

Theorem 6

The vector field \(X^*\in \mathfrak {X}(T^*M)\) is tangent to N and, on the points of N, it is equal to \(X_N\).

Hence the normal integral curves of the vector field \(X^*\) are solutions of a Hamiltonian contact dynamics on the corresponding contact manifold. The contact system is \((N,\eta , H_N)\).

Comment: A short calculation

Here we give the calculation that leads to the expression in (34).

We have that \(H_N= \lambda _oX^o+p_iX^i\) and \(\eta =-\lambda _o\textrm{d}\,x^o-p_i\textrm{d}\, x^i\). Denoting \(X_N\) by

$$\begin{aligned} X_N=a^o\frac{\partial }{\partial x^o}+a^i\frac{\partial }{\partial x^i}+b_i\frac{\partial }{\partial p_i} \end{aligned}$$

the first contact dynamical equation is:

$$\begin{aligned} {\textbf{i}}(X_N)\eta =-H_N\,\Rightarrow \,-\lambda _o a^o-a^i p_i=-\lambda _o X^o - p_i X^i \end{aligned}$$

and the second one

$$\begin{aligned} {\textbf{i}}(X_N)\textrm{d}\,\eta= & {} \textrm{d}\, H_N-(L(R)H_N)\eta \,\Rightarrow \, \\ -b_i\textrm{d}\,x^i+a^i\textrm{d}\,p_i= & {} \lambda _o\frac{\partial X^o}{\partial x^i}\textrm{d}\,x^i+X^i\textrm{d}\,p_i+ p_j\frac{\partial X^j}{\partial x^i}\textrm{d}\,x^i . \end{aligned}$$

Hence

$$\begin{aligned} a^i=X^i,\,\, b_i=-\lambda _o\frac{\partial X^o}{\partial x^i}- p_j\frac{\partial X^j}{\partial x^i},\,\, a^o=X^o \end{aligned}$$

as we wanted.

(b) The abnormal situation \(p_o=0\)

This case corresponds to \(\lambda _o=0\) and the submanifold \(N_o\subset T^*M\) defined by \(p_o=0\). Let \(j_o:N_o\hookrightarrow T^*M\) be the natural inclusion and \(\eta _o=-j_o^*\theta _M\).

Observe that \(\eta _o=-p_i\textrm{d}\,x^i\) is not a contact form. In fact, if \(m_o=\dim M_o\), we have that \(\eta _o\wedge (\textrm{d}\,\eta _o)^{m_o-1}\ne 0\), but \(\eta _o\wedge (\textrm{d}\,\eta _o)^{m_o}=0\).

We can consider the 2-form \(\omega _o=\textrm{d}\, \eta _o\), the Hamiltonian \(H_o=j_o^*H\) and the presymplectic manifold \((N_o,\omega _o,H_o)\). Observe that \(\ker \omega _o=\{\frac{\partial }{\partial x^o}\}\). The Hamiltonian presymplectic equation

$$\begin{aligned} \textbf{i}(X_o)\omega _o= \textrm{d}\,H_o \end{aligned}$$

gives the solution

$$\begin{aligned} X_o= X^i\frac{\partial }{\partial x^i}- p_j\frac{\partial X^j}{\partial x^i}\frac{\partial }{\partial p_i}+A\,\frac{\partial }{\partial x^o}, \end{aligned}$$

where A is arbitrary and corresponds to \(\ker \omega _o\). In fact we have that \({\dot{x}}^o=A\).

There are no constraints, because the vector field \(X_o\) is defined on the whole manifold \(N_o\): the only compatibility condition is \(L_T H_ o=0\) with \(T\in \ker \omega _o\), and this is fulfilled globally on \(N_o\).
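
Explicitly, with \(H_o=p_iX^i\) and \(T=\partial /\partial x^o\), the symmetry assumption \([\partial /\partial x^o,X]=0\) gives

$$\begin{aligned} L_T H_o=\frac{\partial H_o}{\partial x^o}=p_i\frac{\partial X^i}{\partial x^o}=0 \end{aligned}$$

at every point of \(N_o\), so no further condition appears.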

Comment: Observe that \(T^*M=\bigcup _{\lambda \in \mathbb {R}}N_{\lambda }\,\), hence with this decomposition we obtain all the solutions of the initial Hamiltonian problem on \(T^*M\) given by the Hamiltonian H.

5 The Contact Dynamics Approach to Pontryagin Maximum Principle

Following the ideas of the previous sections, we study a contact approach to the Pontryagin maximum principle, in particular to the so-called normal solutions to the optimal control problem. We will obtain these normal solutions as projections of the integral curves of a Hamiltonian contact system on a suitable manifold. The abnormal solutions can be obtained with a different approach, given at the end of this section.

5.1 Statement of the Problem

Let \((M,U,X,I,x_a,x_b)\) be an optimal control problem. We know by Theorem 5 that to solve this problem we need to study the associated Hamiltonian presymplectic system \((T^*M\times U,\omega ,H)\), that is to obtain an integral curve of the vector field \(X_{\textrm{H}}\) solution to the equation \(\textbf{i}(X_{\textrm{H}})\,\omega =\textrm{d}\, H\), where

$$\begin{aligned} \omega =\pi _1^*\omega _M=\textrm{d}x^o\wedge \textrm{d}p_o+\textrm{d}x^i\wedge \textrm{d}p_i,\quad H={\hat{X}}=p_oF+p_iX^i \end{aligned}$$

and \(\pi _1:T^*M\times U\rightarrow T^*M\). Recall that \(\ker \omega =\left\{ \partial /\partial u^a\right\} \).

The solution to the equation \(\textbf{i}(X_{\textrm{H}})\,\omega =\textrm{d}\, H\) is given by:

$$\begin{aligned} X_{\textrm{H}}=F\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}- \left( p_o\frac{\partial F}{\partial x^i}+ p_j\frac{\partial X^j}{\partial x^i}\right) \frac{\partial }{\partial p_i}+A^a\frac{\partial }{\partial u^a} . \end{aligned}$$
(35)

Observe that this solution exists all over the manifold \(T^*M\times U\) and that \(p_o\) is constant for every curve solution to the problem. The last term corresponds to the elements of \(\ker \omega \).

The minimality (compatibility) conditions, \(\frac{\partial H}{\partial u^a}=0\) for every a, are used to determine the controls.

As we said in Sect. 3, if the compatibility equations allow us to determine the controls \(u^1,\ldots ,u^k\), that is, we can obtain \(u^a= \psi ^a(x^o,x^i,p_o,p_i)\), then we say that the optimal control problem is regular; otherwise it is called singular. In the singular case it is necessary to apply a constraint algorithm, that is, to go to higher-order conditions, to obtain the controls, perhaps only on a submanifold of \(T^*M\times U\). Suppose that we are in the regular situation, so that the controls have been determined by the compatibility conditions.

Under the regularity assumption, since the controls \(u^a\) have been determined, \(X_{\textrm{H}}\) projects to the manifold \(T^*M\) and has components only in \((x^o,x^i,p_o,p_i)\). Then we have:

$$\begin{aligned} X_{\textrm{H}}=F\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}- \left( p_o\frac{\partial F}{\partial x^i}+ p_j\frac{\partial X^j}{\partial x^i}\right) \frac{\partial }{\partial p_i} \end{aligned}$$
(36)

because we are in the symplectic case.

We know that, for all the solutions of the associated presymplectic formulation, the momentum \(p_o(t)\) is constant. Following the previous section, we classify the solutions according to the value of \(p_o\). Hence we define and study

  1. (a)

    Normal solutions: those with \(p_o=\lambda _o\ne 0\).

  2. (b)

    Abnormal solutions: those with \(p_o=\lambda _o=0\).

5.2 Normal Solutions: \(p_o=\lambda _o\ne 0\)

Let \(N\subset T^*M\) be the submanifold defined by \(p_o=\lambda _o\) and \(j:N\hookrightarrow T^*M\) be the natural inclusion. We denote by \((x^o,x^i,p_i)\) the coordinates induced in N by the coordinates we have in \(T^*M\).

Consider now the canonical 1-form \(\theta _M\in \Omega ^1(T^*M)\) and let \( \eta =-j^*\theta _M\). Then we have:

Lemma 2

\((N,\eta )\) is a contact manifold. The Reeb vector field is \(R=-\frac{1}{\lambda _o}\frac{\partial }{\partial x^o}\).

Let \(H_N=j^*H\) be the restriction of H to N; then \(H_N= \lambda _oF+p_iX^i\) (here \(X^o=F\)), and we have a Hamiltonian contact system given by \((N,\eta ,H_N)\). Let \(X_N\in \mathfrak {X}(N)\) be the corresponding contact Hamiltonian vector field, that is, the solution to the equations

$$\begin{aligned} {\textbf{i}}(X_N)\eta =-H_N,\,\,\,{\textbf{i}}(X_N)\textrm{d}\,\eta =\textrm{d}\, H_N-(L(R)H_N)\eta \, \end{aligned}$$

whose local expression is

$$\begin{aligned} X_N=F\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}- \left( \lambda _o\frac{\partial F}{\partial x^i}+ p_j\frac{\partial X^j}{\partial x^i}\right) \frac{\partial }{\partial p_i} . \end{aligned}$$
(37)

Here, as usual, we denote by the same names the functions on \(T^*M\) and their restrictions to N.

With this in mind and following Sect. 4.2.2, we have that:

Proposition 3

The vector field \(X_{\textrm{H}}\in \mathfrak {X}(T^*M)\) is tangent to N and, on the points of N, it is equal to \(X_N\).

Hence, for every \(u\in U\), all the normal solutions to the optimal control problem are solutions to a contact Hamiltonian problem.

5.3 Abnormal Solutions: \(p_o=\lambda _o=0\)

Let \(N_o\subset T^*M\) be the submanifold defined by \(p_o=0\). Let \(j_o:N_o\hookrightarrow T^*M\) be the natural inclusion and \(\eta _o =-j_o^*\,\theta _M\).

As above, \(\eta _o=-p_i\textrm{d}\,x^i\) is not a contact form and we have that \(\eta _o\wedge (\textrm{d}\,\eta _o)^{m_o}=0\).

We can consider the 2-form \(\omega _o=\textrm{d}\,\eta _o\), the Hamiltonian \(H_o=j_o^*H\) and the presymplectic manifold \((N_o,\omega _o,H_o)\). Observe that \(\ker \omega _o=\{\frac{\partial }{\partial x^o}\}\). The Hamiltonian presymplectic equation

$$\begin{aligned} \textbf{i}(X_o)\omega _o= \textrm{d}\,H_o \end{aligned}$$

gives the solution

$$\begin{aligned} X_o= X^i\frac{\partial }{\partial x^i}- p_j\frac{\partial X^j}{\partial x^i}\frac{\partial }{\partial p_i}+A\,\frac{\partial }{\partial x^o}, \end{aligned}$$

where A is arbitrary and corresponds to \(\ker \omega _o\).

There are no constraints, because the vector field \(X_o\) is defined on the whole manifold \(N_o\).

Note: We can also solve the precontact problem given by \((N_o^u,\eta _o^u,H_o^u)\).

Comment: Observe that \(T^*M=\bigcup _{\lambda \in \mathbb {R}}N_{\lambda }\,\), hence with this decomposition we obtain all the solutions of the Hamiltonian problem on \(T^*M\) given by the Hamiltonian H: the normal solutions as solutions of contact problems, and the abnormal ones as solutions of presymplectic problems.

With all this in mind, we have proved the following

Theorem 7

(Contact Pontryagin maximum principle) Consider the optimal control problem \((M,U,X,I,x_a,x_b)\), with \(U\subset \mathbb {R}^k\) an open set. Let \({\hat{\sigma }}:I\rightarrow T^*M\times U=T^*\mathbb {R}\times T^*M_o\times U\), \({\hat{\sigma }}=(\sigma _{T^*M},\sigma _U)\), be a solution of the presymplectic Pontryagin maximum principle for such problem and suppose we are in the regular case, that is, the minimality conditions \((\partial H/\partial u)=0\), for every \(u\in U\) and for every \(t\in I\), allow us to determine the controls. Then

  1. (a)

    if \({\hat{\sigma }}\) is a normal solution with \(p_o=\lambda _o\ne 0\), then it is an integral curve of the contact Hamiltonian system \((N,\eta ,H_N)\), as described above, with \(H_N=\lambda _o F+p_iX^i\).

  2. (b)

    if \({\hat{\sigma }}\) is an abnormal solution, then it is an integral curve of the presymplectic Hamiltonian system \((N_o,\omega _o,H_o)\), as described above, with \(H_o=p_i\, X^i\).

The normal solutions satisfy the differential equations:

$$\begin{aligned} {\dot{x}}^o= & {} \frac{\partial H}{\partial p_o}=F,\,\,{\dot{p}}_o=-\frac{\partial H}{\partial x^o}=0,\,\,(\Rightarrow p_o=\textrm{constant}) \\ {\dot{x}}^i= & {} \frac{\partial H}{\partial p_i}=X^i,\,\, {\dot{p}}_i=-\frac{\partial H}{\partial x^i}=-p_o\frac{\partial F}{\partial x^i}-p_j\frac{\partial X^j}{\partial x^i} \\ \frac{\partial H}{\partial u^1}= & {} 0,\ldots ,\frac{\partial H}{\partial u^k}=0 \end{aligned}$$

where \(H=\lambda _oF+p_iX^i\) with \(\lambda _o\ne 0\).

For the abnormal solutions, the corresponding differential equations are

$$\begin{aligned} {\dot{x}}^i= & {} \frac{\partial H}{\partial p_i}=X^i,\,\, {\dot{p}}_i=-\frac{\partial H}{\partial x^i}=-p_j\frac{\partial X^j}{\partial x^i} \\ \frac{\partial H}{\partial u^1}= & {} 0,\ldots ,\frac{\partial H}{\partial u^k}=0 \end{aligned}$$

where \(H=p_iX^i\).
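
A minimal worked example, chosen here for illustration and not taken from the references: let \(M_o=\mathbb {R}\) with coordinate x, \(U=\mathbb {R}\), \(X=u\,\partial /\partial x\) and \(F=-\frac{1}{2}u^2\). For a normal solution, \(H=-\frac{1}{2}\lambda _ou^2+p\,u\), and the equations above give

$$\begin{aligned} \frac{\partial H}{\partial u}=-\lambda _ou+p=0\,\Rightarrow \,u=\frac{p}{\lambda _o},\qquad {\dot{p}}=0,\qquad {\dot{x}}=\frac{p}{\lambda _o},\qquad {\dot{x}}^o=-\frac{p^2}{2\lambda _o^2} , \end{aligned}$$

so the extremal trajectories are straight lines traversed at constant speed. For an abnormal solution, \(H=p\,u\) and the condition \(\partial H/\partial u=0\) forces \(p=0\) while leaving u undetermined; in this example the regularity assumption of Theorem 7 therefore holds only in the normal case.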

Comment: We note that in Ohsawa (2015) an approach to the Pontryagin maximum principle is given in terms of contact systems. Indeed, the author works in the projectivization of the cotangent bundle, \(\mathbb {P} T^* (M_o \times \mathbb {R})\). In that approach, the normal and abnormal solutions are unified, and the abnormal ones correspond to the hyperplane at infinity. However, this manifold is not a contact manifold in the sense we are using, so we are forced to remove the hyperplane at infinity and obtain \(T^* M_o \times \mathbb {R}\).

This can be seen as a different formulation of Theorem 7 in our paper, where we treat separately both kinds of solutions, which correspond to different geometries (contact and presymplectic). In addition, in that reference the author does not study the relationship between the Herglotz variational principle and optimal control, which is the focus of our paper.

6 Herglotz Variational Problem as an Optimal Control Problem

In Sect. 2.3.1 we studied the Herglotz variational principle; there we obtained the contact equations of a Hamiltonian contact system as the solution of a variational problem that generalizes the Hamilton variational principle. This more general principle was stated and solved in 1930 by Gustav Herglotz, see Herglotz (1930) and Guenther et al. (1996). The idea was to replace the integral that defines the action along curves by a differential equation defined precisely by the Lagrangian function. Interest in this approach, and in its relation with contact dynamics and dissipative systems, has been increasing since the last cited publication; see for example Georgieva and Guenther (2002) and Bravetti et al. (2017) and references therein. In this section we approach the Herglotz principle as an optimal control problem and find the corresponding differential equations, the generalized Euler–Lagrange equations, with a new proof through the Pontryagin maximum principle.

6.1 Statement of the Problem

We begin by recalling the statement of the Herglotz variational problem as we did in Sect. 2.3.1.

Let Q be a smooth manifold and \(F:TQ\times \mathbb {R}\rightarrow \mathbb {R}\) a smooth function and consider the following problem:

Herglotz variational problem: Find curves \(\Gamma =(\gamma ,\zeta ):I=[a,b]\rightarrow Q\times \mathbb {R}\), such that

  1. (1)

    end points conditions: \(\gamma (a)=q_a, \gamma (b)=q_b,\zeta (a)=0\),

  2. (2)

    \(\Gamma \) is an integral curve of \({\dot{z}}=F(q,v,z)\): \(\dot{\zeta }=F(\gamma (t),{\dot{\gamma }}(t),\zeta (t))\), for every \(t\in I\), and

  3. (3)

    extreme condition: \(\zeta (b)\) is maximum over all curves satisfying (1) and (2).

Observe that we have considered the differential equation \({\dot{z}}=F(q,v,z)\) depending on the curves \(\gamma \). If the function F does not depend on the variable z, that is \(F:TQ\rightarrow \mathbb {R}\), then the differential equation is \({\dot{z}}=F(\gamma ,{\dot{\gamma }})\); hence, by integration, the problem is the classical variational one: find the curves \(\gamma (t)\) maximizing

$$\begin{aligned} S[\gamma ]=\int _a^b F(\gamma (t),{\dot{\gamma }}(t))\,\textrm{d}\, t \end{aligned}$$

with end points conditions \(\gamma (a)=q_a, \gamma (b)=q_b\).

As we know, Herglotz obtained that the curves \(\gamma \) solution to this problem satisfy the so-called generalized Euler–Lagrange equations

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\left( \frac{\partial F}{\partial v^i}\right) _\gamma -\frac{\partial F}{\partial q^i} -\frac{\partial F}{\partial z}\frac{\partial F}{\partial v^i}=0 . \end{aligned}$$
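
For instance, take \(Q=\mathbb {R}^n\) and \(F=\frac{1}{2}\delta _{ij}v^iv^j-V(q)-\kappa z\), where the potential V and the constant \(\kappa \) are assumptions introduced only for this illustration. The generalized Euler–Lagrange equations then become

$$\begin{aligned} \delta _{ij}{\ddot{q}}^j+\kappa \,\delta _{ij}{\dot{q}}^j+\frac{\partial V}{\partial q^i}=0 , \end{aligned}$$

that is, Newton's equations with a friction term linear in the velocity. For \(\kappa =0\) the function F no longer depends on z and the classical Euler–Lagrange equations are recovered.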

In this section we will obtain these differential equations as an application of the Pontryagin maximum principle to a suitable optimal control problem associated to the Herglotz variational problem.

To do so, we begin by giving a geometric statement of the Herglotz problem. Given the function \(F:TQ\times \mathbb {R}\rightarrow \mathbb {R}\), consider the upper right triangle of the following diagram

where \(Z\in {\mathfrak {X}}(\mathbb {R},\pi _2)\) is the vector field on \(\mathbb {R}\) along the projection \(\pi _2\) defined by

$$\begin{aligned} Z=F \,\frac{\partial }{\partial z} . \end{aligned}$$

Now taking the full diagram, we have the following problem associated with the vector field Z

Geometric Herglotz variational problem: Find curves \(\Gamma :I=[a,b]\rightarrow Q\times \mathbb {R}\), \(\Gamma =(\gamma ,\zeta )\), such that

  1. (1)

    end points conditions: \(\Gamma (a)=(q_a,0) , \gamma (b)=q_b\),

  2. (2)

    \(\Gamma \) is an integral curve of Z: \(\dot{\zeta }=F({\tilde{\Gamma }}(t))\), for every \(t\in I\), where \({\tilde{\Gamma }}=(\gamma '=(\gamma ,{\dot{\gamma }}),\zeta )\), and

  3. (3)

    extreme condition: \(\zeta (b)\) is maximum over all curves satisfying (1) and (2).

Obviously the two above problems are equivalent. The difference is only in the language used to state them.

6.2 Optimal Control Approach to the Herglotz Variational Problem

Associated to the function \(F:TQ\times \mathbb {R}\rightarrow \mathbb {R}\), consider the following diagram

where Y is the vector field on \(Q\times \mathbb {R}\) along the projection \(\tau _Q\times I_\mathbb {R}:TQ\times \mathbb {R}\rightarrow Q\times \mathbb {R}\) defined by \(Y((q,v),z)=((q,z),v,F)\), which in local coordinates is given by

$$\begin{aligned} Y=v^i\frac{\partial }{\partial q^i}+F\frac{\partial }{\partial z} . \end{aligned}$$

This vector field corresponds to the system of ordinary differential equations:

$$\begin{aligned} {\dot{q}}^i=v^i,\qquad {\dot{z}}=F(q^i, v^i,z) . \end{aligned}$$

Observe that the first summand of the vector field Y is a canonical vector field along the projection \(\tau _Q:TQ\rightarrow Q\); it corresponds to the identity map \(I_{TQ}:TQ\rightarrow TQ\). Hence the vector field Y is associated in a natural way with the function F.

These elements define a control system with vector field \(Y\in {\mathfrak {X}}(Q\times \mathbb {R}, \tau _Q\times I_\mathbb {R})\), on the state space \(Q\times \mathbb {R}\), and with the fibres of TQ as the set of controls; that is for every state \((q,z)\in Q\times \mathbb {R}\), the controls are the elements \(v\in T_q Q\).

On this control system we state the following optimal control problem: Consider the diagram

where, if \(\Gamma =(\gamma ,\zeta )\), then \({\tilde{\Gamma }}=(\gamma ',\zeta )=((\gamma ,{\dot{\gamma }}),\zeta )\).

For a curve \(\Gamma :I\rightarrow Q\times \mathbb {R}\), we take its canonical lifting to the tangent bundle, \(\Gamma ':I\rightarrow T(Q\times \mathbb {R})\), that is: if \(\Gamma =(\gamma ,\zeta )\) then \(\Gamma '= ((\gamma ,\zeta ), ({\dot{\gamma }}, {\dot{\zeta }}))\).

We say that a curve \(\Gamma \) is an integral curve of the vector field Y if:

$$\begin{aligned} \Gamma '=Y\circ {\tilde{\Gamma }},\qquad ({\dot{\gamma }},{\dot{\zeta }})=Y(\gamma ,{\dot{\gamma }},\zeta )=(v^i(\gamma ,{\dot{\gamma }}),F(\gamma ,{\dot{\gamma }},\zeta ))\, , \end{aligned}$$

which, in local coordinates, is a solution to the above system of differential equations:

$$\begin{aligned} {\dot{q}}^i=v^i,\qquad {\dot{z}}=F(q^i, v^i,z) . \end{aligned}$$

Hence we have the optimal control problem given by:

Optimal control problem associated to Herglotz variational problem:

Find curves \(\Gamma :I=[a,b]\rightarrow Q\times \mathbb {R}\), \(\Gamma =(\gamma ,\zeta )\), such that

  1. (1)

    end points conditions: \(\Gamma (a)=(q_a,0) , \gamma (b)=q_b\),

  2. (2)

    \(\Gamma \) is an integral curve of Y: \(\Gamma '(t)=Y({\tilde{\Gamma }}(t))\), for every \(t\in I\) and

  3. (3)

    optimal condition: \(\zeta (b)\) is maximum over all curves satisfying (1) and (2).
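As a concrete illustration, consider the following sketch. The choice \(Q=\mathbb {R}\) and the function \(F(q,v,z)=\tfrac{1}{2}m v^2-V(q)-\kappa z\), with constants \(m>0\), \(\kappa \ge 0\) and a potential V, are assumptions made here for the example and are not taken from the text. Under these assumptions the problem would read

$$\begin{aligned} {\dot{q}}=v,\qquad {\dot{z}}=\tfrac{1}{2}m v^2-V(q)-\kappa z,\qquad q(a)=q_a,\quad z(a)=0,\quad q(b)=q_b,\qquad z(b)\ \ \textrm{maximum} , \end{aligned}$$

so that the control v is the velocity and the quantity to be optimized is the final value of the state variable z.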

Observe that the optimal condition can be stated as:

$$\begin{aligned} \max \, z(b)=\max \int _a^b{\dot{z}}(t)\,\textrm{d}t=\max \int _a^bF(q(t),{\dot{q}}(t),z(t))\,\textrm{d}t , \end{aligned}$$

hence we have a classical optimal control problem with F as the cost function.

This optimal control problem, which can be solved using the Pontryagin maximum principle, is equivalent to the above Herglotz variational problem: if \(\Gamma =(\gamma ,\zeta )\) is a solution to the above optimal control problem then \(\gamma \) is a solution to the Herglotz variational problem and \({\dot{\zeta }}=F(\gamma ,{\dot{\gamma }},\zeta )\), and conversely.

We denote this problem by \((M,U,X,I,x_a,x_b)=(Q\times \mathbb {R}, TQ, Y,I,q_a,q_b)\) with the notation described in Sect. 3.

6.3 Application of the Presymplectic Form of the Pontryagin Maximum Principle

According to Sect. 3, first we have to extend the problem and declare the direction where the optimization must be done using the cost function.

6.3.1 The Extended Problem

Observe that in the above optimal control problem, \((Q\times \mathbb {R}, TQ, Y,I,q_a,q_b)\), the cost function is F, which is also the derivative of the state variable z. Hence we need to extend the problem by adding a new variable with F as derivative. Denote by \(q^o\in \mathbb {R}\) this new variable. The differential equation to add to the system is \({\dot{q}}^o= F(q,v,z)\). To pass to this extended problem we need to consider the diagram

and take the control system given by the dynamical vector field \({\hat{Y}}\in {\mathfrak {X}}(\mathbb {R}\times Q\times \mathbb {R},I_{\mathbb {R}}\times \tau _Q\times I_\mathbb {R})\), which in coordinates reads

$$\begin{aligned} {\hat{Y}}=F\frac{\partial }{\partial q^o}+v^i\frac{\partial }{\partial q^i}+F\frac{\partial }{\partial z} , \end{aligned}$$

with the manifold \(\mathbb {R}\times Q\times \mathbb {R}\) as state space and with the fibres of TQ as controls; that is, for every state \((q^o,q,z)\in \mathbb {R}\times Q\times \mathbb {R}\), the controls are the elements \(v\in T_q Q\).

On this system, the precise statement of the optimal control problem we have is:

Extended optimal control formulation of the Herglotz variational problem:

Find curves \({\hat{\Gamma }}:I=[a,b]\rightarrow \mathbb {R}\times Q\times \mathbb {R}\), \({\hat{\Gamma }}=(\gamma ^o,\gamma ,\zeta )\), such that

  1. (1)

    end point conditions: \({\hat{\Gamma }}(a)=(0,q_a,0)\), \(\gamma (b)=q_b\),

  2. (2)

    \({\hat{\Gamma }}\) is an integral curve of \({\hat{Y}}\): \({\hat{\Gamma }}'(t)={\hat{Y}}(\tilde{{\hat{\Gamma }}}(t))\), for every \(t\in I\), where \(\tilde{{\hat{\Gamma }}}:I\rightarrow \mathbb {R}\times TQ\times \mathbb {R}\), \(\tilde{{\hat{\Gamma }}}=(\gamma ^o,(\gamma , {\dot{\gamma }}),\zeta )\), and

  3. (3)

    extreme condition: \(\zeta (b)\) is maximum over all curves satisfying (1) and (2).

This is the optimal control problem denoted by \((\mathbb {R}\times Q\times \mathbb {R}, TQ, {\hat{Y}},I,q_a,q_b)\).

6.3.2 Solution of the Extended Problem with the Presymplectic Form of the Pontryagin Maximum Principle

Following Sect. 3, to solve this optimal control problem consider the projection

$$\begin{aligned} {\hat{\pi }}_1: T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ\rightarrow T^*(\mathbb {R}\times Q\times \mathbb {R}) . \end{aligned}$$

This last manifold has a canonical symplectic form \(\omega _{\mathbb {R}\times Q \times \mathbb {R}} \in \Omega ^2(T^*(\mathbb {R}\times Q\times \mathbb {R}))\), \(\omega _{\mathbb {R}\times Q \times \mathbb {R}}=-\textrm{d}\theta _{\mathbb {R}\times Q \times \mathbb {R}}\), which in canonical coordinates, \((q^o,p_o,q^i,p_i,z,p_z)\), reads

$$\begin{aligned} \omega _{\mathbb {R}\times Q \times \mathbb {R}}= & {} -\textrm{d}\theta _{\mathbb {R}\times Q \times \mathbb {R}}=-\textrm{d}(p_o\textrm{d}q^o+p_i\textrm{d}q^i+p_z\textrm{d}z)\\= & {} \textrm{d}q^o\wedge \textrm{d}p_o+\textrm{d}q^i\wedge \textrm{d}p_i+\textrm{d}z\wedge \textrm{d}p_z . \end{aligned}$$

Let \(\omega ={\hat{\pi }}_1^*\,\omega _{\mathbb {R}\times Q \times \mathbb {R}}\). Then \(\omega \) is a presymplectic form on \(T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ\) whose kernel consists of the vertical vector fields on TQ, that is, those tangent to the fibres of \(\tau _Q:TQ\rightarrow Q\). Hence \(\ker \omega \) is locally generated by

$$\begin{aligned} \frac{\partial }{\partial v^1},\ldots ,\frac{\partial }{\partial v^n} , \end{aligned}$$

if \(\dim Q=n\). The local expressions of \(\omega \) and \(\omega _{\mathbb {R}\times Q \times \mathbb {R}}\) are the same, with the usual abuse of notation for the local coordinates.

With the vector field \({\hat{Y}}\), as usual, we can build a natural Hamiltonian function \(H:T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ\rightarrow \mathbb {R}\), locally given by

$$\begin{aligned} H\big (q^o,p_o,q^i,p_i,z,p_z,v^i\big )=p_oF+p_iv^i+p_z F , \end{aligned}$$

and consider the presymplectic system \((T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ,\omega ,H)\).

The corresponding Hamiltonian vector field \(X_{\textrm{H}}\), satisfying the equation \({\textbf {i}}_{X_{\textrm{H}}}\omega =\textrm{d}H\), is locally given by

$$\begin{aligned} X_{\textrm{H}}= & {} F\frac{\partial }{\partial q^o}+0\frac{\partial }{\partial p_o}+ v^i\frac{\partial }{\partial q^i}+F\frac{\partial }{\partial z} \end{aligned}$$
(38)
$$\begin{aligned}{} & {} -\left( p_o\frac{\partial F}{\partial q^i}+p_z\frac{\partial F}{\partial q^i}\right) \frac{\partial }{\partial p_i}- \left( p_o\frac{\partial F}{\partial z} +p_z\frac{\partial F}{\partial z}\right) \frac{\partial }{\partial p_z}+A^i\frac{\partial }{\partial v^i} , \end{aligned}$$
(39)

where the last term corresponds to the kernel of \(\omega \).

The compatibility conditions for the presymplectic system, or the Pontryagin maximum principle optimality conditions, are given by, see Gotay and Nester (1979) and Muñoz-Lecanda and Román-Roy (1992),

$$\begin{aligned} L_VH=0 \end{aligned}$$

for every \(V\in \ker \omega \), when restricted to the curves \(\sigma =(\sigma ^o, \delta _0,\sigma ^i,\delta _i, \sigma ^z,\delta _z, w^i)\), solution to the system of differential equations

$$\begin{aligned}{} & {} {\dot{q}}^o=F,\qquad {\dot{p}}_o=0 \end{aligned}$$
(40)
$$\begin{aligned}{} & {} {\dot{q}}^i=v^i,\quad {\dot{p}}_i=-(p_o+p_z)\frac{\partial F}{\partial q^i} \end{aligned}$$
(41)
$$\begin{aligned}{} & {} {\dot{z}}=F, \quad {\dot{p}}_z=-(p_o+p_z)\frac{\partial F}{\partial z} \end{aligned}$$
(42)
$$\begin{aligned}{} & {} {\dot{v}}^i=A^i \end{aligned}$$
(43)

where the \(A^i\) are free. These differential equations correspond to the integral curves of the vector field \(X_{\textrm{H}}\).

In local coordinates, the compatibility conditions are \(L_{\frac{\partial }{\partial v^i}}H=0\), for every \(i=1,\ldots , n\). As \(H=(p_o+p_z)F+p_iv^i\), we have:

$$\begin{aligned} L_{\frac{\partial }{\partial v^i}}H=(p_o+p_z)\frac{\partial F}{\partial v^i}+p_i=0 . \end{aligned}$$

In the weak presymplectic Pontryagin maximum principle, these are the conditions from which we can obtain the controls \(v^i\), looking for the critical points of H with respect to the controls.

In the present situation, these functions are constraints defining a submanifold of \(T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ\), and the Hamiltonian vector field solution, \(X_{\textrm{H}}\), has to be tangent to this submanifold, hence:

$$\begin{aligned} L_{X_{\textrm{H}}}\left( (p_o+p_z)\frac{\partial F}{\partial v^i}+p_i\right) =0 , \end{aligned}$$

but

$$\begin{aligned}{} & {} L_{X_{\textrm{H}}}\left( (p_o+p_z)\frac{\partial F}{\partial v^i}+p_i\right) \\{} & {} \quad =(p_o+p_z)\!\left( v^j\frac{\partial ^2 F}{\partial q^j\partial v^i}+F\frac{\partial ^2 F}{\partial z\partial v^i} -\frac{\partial F}{\partial q^i}-\frac{\partial F}{\partial z}\frac{\partial F}{\partial v^i}+A^j\frac{\partial ^2 F}{\partial v^j\partial v^i}\right) , \end{aligned}$$

where \(A^j= {\dot{v}}^j\). Hence we have:

$$\begin{aligned} (p_o+p_z)\left( v^j\frac{\partial ^2 F}{\partial q^j\partial v^i}+F\frac{\partial ^2 F}{\partial z\partial v^i} -\frac{\partial F}{\partial q^i} - \frac{\partial F}{\partial z}\frac{\partial F}{\partial v^i}+{\dot{v}}^j\frac{\partial ^2 F}{\partial v^j\partial v^i}\right) =0 . \end{aligned}$$

which, on the solution curves, gives

$$\begin{aligned} L_{X_{\textrm{H}}}\left( (p_o+p_z)\frac{\partial F}{\partial v^i}+p_i\right) =(p_o+p_z)\left( \frac{\textrm{d}}{\textrm{d}t}\frac{\partial F}{\partial v^i}-\frac{\partial F}{\partial q^i} -\frac{\partial F}{\partial z}\frac{\partial F}{\partial v^i}\right) =0 . \end{aligned}$$

These differential equations are necessary conditions that a curve \(\sigma \) on the manifold \(T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ\), solution to the presymplectic system, has to satisfy when it is projected to \(Q\times \mathbb {R}\).

But we have that

Lemma 3

On the solution curves the quantity \(p_o+p_z\) vanishes if and only if it vanishes at the initial point.

Proof

We know that \({\dot{p}}_z=-(p_o+p_z)\frac{\partial F}{\partial z}\) and \({\dot{p}}_o=0\), hence \(p_o\) is a constant. Then, the differential equation defining \(p_z\) is

$$\begin{aligned} {\dot{p}}_z=-p_z\frac{\partial F}{\partial z}-p_o\frac{\partial F}{\partial z}=-Ap_z-p_oA , \end{aligned}$$

where \(A=\frac{\partial F}{\partial z}\) is, on the solution curves, a function of t. This is a linear differential equation whose general solution, on \(I=[a,b]\), is

$$\begin{aligned} p_z(t)+p_o=(p_z(a)+p_o)\exp \left( -\int _a^t\frac{\partial F}{\partial z}\,\textrm{d}s \right) \end{aligned}$$

and the proof is finished. \(\square \)
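For completeness, the integration step can be sketched as follows. The auxiliary symbol \(s:=p_o+p_z\) and the choice of the initial point a of \(I=[a,b]\) as lower limit are notational assumptions introduced here, not taken from the text. Since \({\dot{p}}_o=0\),

$$\begin{aligned} {\dot{s}}={\dot{p}}_z=-\frac{\partial F}{\partial z}\,s \quad \Longrightarrow \quad \frac{\textrm{d}}{\textrm{d}t}\left[ s(t)\exp \left( \int _a^t\frac{\partial F}{\partial z}\,\textrm{d}\tau \right) \right] =0 \quad \Longrightarrow \quad s(t)=s(a)\exp \left( -\int _a^t\frac{\partial F}{\partial z}\,\textrm{d}\tau \right) , \end{aligned}$$

where \(\partial F/\partial z\) is evaluated along the solution curve. As the exponential factor never vanishes, \(s(t)=0\) for some \(t\in I\) if and only if \(s(a)=0\).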

6.4 The Final Results

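From the above Lemma, whenever \(p_o+p_z\) does not vanish at the initial point it never vanishes along the solution, so it can be cancelled in the previous equation, and we obtain: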

Theorem 8

If \(\sigma =(\sigma ^o, \delta _0,\sigma ^i,\delta _i, \sigma ^z,\delta _z, w^i)\) is a solution to the presymplectic system \((T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ,\omega ,H)\), then its projection to \(Q\times \mathbb {R}\), \((\sigma ^i,\sigma ^z)\), satisfies the equations

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\frac{\partial F}{\partial v^i}-\frac{\partial F}{\partial q^i} -\frac{\partial F}{\partial z}\frac{\partial F}{\partial v^i}=0 \end{aligned}$$

Hence if we include in the statement the original problem, we have proven the following

Theorem 9

If \(\sigma =(\sigma ^o, \delta _0,\sigma ^i,\delta _i, \sigma ^z,\delta _z, w^i)\), \(\sigma :I\rightarrow T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ\), is a solution to the presymplectic system \((T^*(\mathbb {R}\times Q\times \mathbb {R})\times TQ,\omega ,H)\), then

  1. (a)

    its projection to \(\mathbb {R}\times Q\times \mathbb {R}\), \({\hat{\Gamma }}:I=[a,b]\rightarrow \mathbb {R}\times Q\times \mathbb {R}\), \({\hat{\Gamma }}=(\gamma ^o=\sigma ^o,\gamma =(\sigma ^i),\zeta =\sigma ^z)\), is a solution to the extended optimal control problem \((\mathbb {R}\times Q\times \mathbb {R}, TQ, {\hat{Y}},I,q_a,q_b)\),

  2. (b)

    its projection to \(Q\times \mathbb {R}\), \(\Gamma :I=[a,b]\rightarrow Q\times \mathbb {R}\), \(\Gamma =(\gamma =(\sigma ^i),\zeta =\sigma ^z)\), is a solution to the optimal control problem \((Q\times \mathbb {R}, TQ, Y,I,q_a,q_b)\).

As we know, this optimal control problem is equivalent to the Herglotz variational problem given by the function \(F:TQ\times \mathbb {R}\rightarrow \mathbb {R}\), the interval \(I=[a,b]\) and the end point conditions \(q_a,q_b\). Hence we have proven the

Theorem 10

Let Q be a manifold and \(F:TQ\times \mathbb {R}\rightarrow \mathbb {R}\) a function. If \(\Gamma =(\gamma ,\zeta )\) is a solution to the Herglotz variational problem defined by F, then \(\Gamma \) satisfies the differential equations

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\frac{\partial F}{\partial v^i}-\frac{\partial F}{\partial q^i} -\frac{\partial F}{\partial z}\frac{\partial F}{\partial v^i}=0 , \end{aligned}$$

which are known as generalized Euler–Lagrange equations for the Herglotz problem.

See Guenther et al. (1996), Georgieva and Guenther (2002) for comparison between different proofs.
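To see what these equations look like in a simple case, consider the following sketch. The choice \(Q=\mathbb {R}\) and \(F(q,v,z)=\tfrac{1}{2}m v^2-V(q)-\kappa z\), with constants \(m>0\) and \(\kappa \ge 0\), is an assumption made here for illustration and does not come from the text. Then \(\partial F/\partial v=mv\), \(\partial F/\partial q=-V'(q)\) and \(\partial F/\partial z=-\kappa \), so the generalized Euler–Lagrange equation becomes

$$\begin{aligned} m{\dot{v}}+V'(q)+\kappa \, m v=0,\qquad v={\dot{q}} , \end{aligned}$$

that is, \(m{\ddot{q}}=-V'(q)-\kappa \, m{\dot{q}}\), Newton's equation with a friction force linear in the velocity. For \(\kappa =0\) the usual Euler–Lagrange equation is recovered.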

7 Herglotz Optimal Control Problem

In this section we give a generalization of the classical optimal control problem, in the same way that the Herglotz variational problem generalizes the Hamilton principle in mechanics.

7.1 Statement of the Problem

As described in Sect. 3, a classical optimal control problem is given by the elements \((M,U,X,F,I,x_a,x_b)\). The cost function \(F:M\times U\rightarrow \mathbb {R}\) is used to express the optimization condition as an integral

$$\begin{aligned} S[\gamma (t)=(x(t),u(t))]=\int _a^b F(x(t), u(t))\textrm{d}\,t , \end{aligned}$$

which is a functional on the curves \(\gamma :I\rightarrow M\times U\), satisfying some initial conditions and being integral curves of the vector field X, that is \({\dot{x}}=X(x(t),u(t))\).

This is “similar” to the classical variational calculus with F in the role of the Lagrangian and the integrability condition for the curves as a constraint.

However, the generalization proposed and studied by Herglotz replaces the integral functional to be optimized by a differential equation satisfied by a new variable, denoted by z. This differential equation is defined by the cost function F, that is \({\dot{z}}=F(x,u,z)\), and takes the place of the above integral; see Sects. 2.3.1 or 6 for more details.

Now we propose a generalization of the classical optimal control problem following the ideas of Herglotz.

Recalling the elements that define an optimal control problem, we have the diagram

that is \(X\in {\mathfrak {X}}(M,\pi )\), \(X(x,z,u)\), and a cost function, \(F(x,z,u)\), to integrate along the solution curves of the differential equation given by X. Instead of this cost function, we take a function \(F:M\times \mathbb {R}\times U\rightarrow \mathbb {R}\), depending also on a new variable z, and consider the following problem:

Herglotz optimal control problem:

Find curves \(\gamma :I=[a,b]\rightarrow M\times \mathbb {R}\times U\), \(\gamma =(\gamma _M,\gamma _z,\gamma _U)\), such that

  1. (1)

    end point conditions: \(\gamma _M(a)=x_a, \gamma _M(b)=x_b, \gamma _z(a)=0\),

  2. (2)

    \(\gamma _M\) is an integral curve of X: \(\dot{\gamma }_M=X\circ (\gamma _M,\gamma _z,\gamma _U)\),

  3. (3)

    \(\gamma _z\) satisfies the differential equation \({\dot{z}}=F(x,z,u)\), and

  4. (4)

    maximal condition: \(\gamma _z(b)\) is maximum over all curves satisfying (1), (2) and (3).

The differential equations corresponding to this problem are

$$\begin{aligned} {\dot{x}}^i=X^i(x,z,u),\qquad {\dot{z}}=F(x,z,u) . \end{aligned}$$
(44)

If the function F does not depend on z, then the maximal condition takes the form

$$\begin{aligned} (4')\,\, z(b)= \int _a^b F(x,u)\textrm{d}t\quad \mathrm { is\,\, maximum} \end{aligned}$$

which gives a classical optimal control problem. Hence we have a generalization of the classical problem in the sense of Herglotz.
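To illustrate how a genuine dependence on z modifies the functional, consider the following sketch. The cost \(F(x,z,u)=f(x,u)-\kappa z\), with a constant \(\kappa \) and a function f, is a hypothetical choice made here and not taken from the text. The equation \({\dot{z}}=f(x,u)-\kappa z\) with \(z(a)=0\) is linear in z and can be integrated explicitly:

$$\begin{aligned} z(t)=\int _a^t e^{-\kappa (t-s)}f(x(s),u(s))\,\textrm{d}s \quad \Longrightarrow \quad z(b)=\int _a^b e^{-\kappa (b-t)}f(x(t),u(t))\,\textrm{d}t . \end{aligned}$$

Under this assumption the quantity to maximize is an exponentially weighted integral of f, and for \(\kappa =0\) it reduces to the classical condition \((4')\).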

In order to solve this problem we begin by transforming it into a classical optimal control problem.

7.2 Solution to Herglotz Optimal Control Problem

There is another way to organize all these elements, \((M,U,X,F,x_a,x_b)\), in a shorter form. Consider the following diagram

where \(Z=X+Y\), that is \(Z=X^i\frac{\partial }{\partial x^i}+F\frac{\partial }{\partial z}\), locally. And the curves are \(\Gamma =(\gamma _M,\gamma _z)\), \(\gamma = (\Gamma ,\gamma _U)= (\gamma _M,\gamma _z,\gamma _U)\).

Then we have another equivalent statement:

Herglotz optimal control problem: Find curves \(\gamma :I=[a,b]\rightarrow M\times \mathbb {R}\times U\), \(\gamma =(\gamma _M,\gamma _z,\gamma _U)\), \(\Gamma =(\gamma _M,\gamma _z)\) such that

  1. (1)

    end point conditions: \(\gamma _M(a)=x_a, \gamma _M(b)=x_b, \gamma _z(a)=0\),

  2. (2)

    \(\Gamma \) is an integral curve of Z: \(\dot{\Gamma }=Z\circ \gamma \), and

  3. (3)

    optimal condition: \(\gamma _z(b)\) is maximum over all curves satisfying (1) and (2).

Condition (2) is written as \(({\dot{\gamma }}_M,{\dot{\gamma }}_z)=Z\circ \gamma \), that is

$$\begin{aligned} {\dot{x}}^i=X^i(x,z,u),\qquad {\dot{z}}=F(x,z,u) . \end{aligned}$$
(45)

These are the same differential equations as Eq. (44); hence both problems are equivalent. In the sequel we refer to this second form.

Observe that with this approach, we have a classical optimal control problem and we can find its solution following the method of Sect. 3, in particular by applying the weak presymplectic form of the Pontryagin maximum principle, Theorem 5. In this case, the function to optimize is one of the directions of state space which is given by z.

We begin, as usual, by extending the vector field: we obtain the extended system by adding a new variable \(x^o\) associated with the variable z to be maximized. The new vector field is

$$\begin{aligned} \overline{X}= F\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}+F\frac{\partial }{\partial z}\in {\mathfrak {X}}(\mathbb {R}\times M\times \mathbb {R}) . \end{aligned}$$

Then the associated Hamiltonian is \(H(x^o,p_o,x^i,p_i,z,p_z,u)=p_oF+p_iX^i+p_z F\), defined on the manifold \(T^*(\mathbb {R}\times M\times \mathbb {R})\times U\). The presymplectic form is \(\omega =\textrm{d}x^o\wedge \textrm{d}p_o+\textrm{d}x^i\wedge \textrm{d}p_i+\textrm{d}z\wedge \textrm{d}p_z\), with kernel given by the tangent vector fields to U, and the Hamiltonian vector field \(X_{\textrm{H}}\), solution to the equation \({\textbf {i}}_{X_{\textrm{H}}}\omega =\textrm{d}H\), is locally given by

$$\begin{aligned} X_{\textrm{H}}= & {} F\frac{\partial }{\partial x^o}+0\frac{\partial }{\partial p_o}+ X^i\frac{\partial }{\partial x^i}+F\frac{\partial }{\partial z} \\{} & {} -\left( p_o\frac{\partial F}{\partial x^i}+p_j\frac{\partial X^j}{\partial x^i}+p_z\frac{\partial F}{\partial x^i}\right) \frac{\partial }{\partial p_i}- \left( p_o\frac{\partial F}{\partial z} + p_i \frac{\partial X^i}{\partial z} +p_z\frac{\partial F}{\partial z}\right) \frac{\partial }{\partial p_z}\\{} & {} +A^a\frac{\partial }{\partial u^a} , \end{aligned}$$

where the last term corresponds to the kernel of \(\omega \).

Observe that this solution exists all over the manifold \(T^*(\mathbb {R}\times M\times \mathbb {R})\times U\) and that \(p_o\) is constant for every curve solution to the problem.

Since the system is presymplectic, the compatibility equations are given by \(\textbf{i}(Z)\textrm{d}\,H=0\) for every \(Z\in \ker \omega \), that is, the equations

$$\begin{aligned} \frac{\partial H}{\partial u^1}=0,\ldots ,\frac{\partial H}{\partial u^k}=0 \end{aligned}$$
(46)

which, together with the equations coming from the vector field \(X_{\textrm{H}}\), give us a set of equations to solve the optimal control problem. Recall that these compatibility conditions are the same as the optimality conditions.
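Written out with the Hamiltonian \(H=p_oF+p_iX^i+p_zF\) given above, the compatibility conditions read

$$\begin{aligned} \frac{\partial H}{\partial u^a}=(p_o+p_z)\frac{\partial F}{\partial u^a}+p_i\frac{\partial X^i}{\partial u^a}=0,\qquad a=1,\ldots ,k . \end{aligned}$$

As a sketch of a case where they can be solved for the controls, assume (this choice is hypothetical and not taken from the text) that \(F=f(x,z)-\tfrac{1}{2}\sum _a (u^a)^2\) and \(X^i=g^i(x,z)+B^i_a(x,z)\,u^a\) for some functions f, \(g^i\) and \(B^i_a\). Then, wherever \(p_o+p_z\ne 0\),

$$\begin{aligned} -(p_o+p_z)\,u^a+p_iB^i_a=0 \quad \Longrightarrow \quad u^a=\frac{p_iB^i_a(x,z)}{p_o+p_z} . \end{aligned}$$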

As in ordinary optimal control problems, suppose that the compatibility equations allow us to determine the controls \(u^1,\ldots ,u^k\), that is, we can obtain \(u^a=\psi ^{a}(x^o,x^i,z,p_o,p_i,p_z)\); then we say that the optimal control problem is regular, otherwise it is called singular. In the singular case, it is necessary to apply a constraint algorithm, that is, to go to higher order conditions, to obtain the controls, perhaps only on a submanifold of \(T^*(\mathbb {R}\times M\times \mathbb {R})\times U\).

The differential equations associated with the above vector field \(X_{\textrm{H}}\), together with equations (46) are the solution equations to the Herglotz optimal control problem.

Remark 5

To understand the significance of these equations, we can compare the above set of equations with the corresponding ones for a classical optimal control system. Apart from the compatibility conditions, which are the same, the vector field solution, see Theorem 5, was given by

$$\begin{aligned} X_N=F\frac{\partial }{\partial x^o}+X^i\frac{\partial }{\partial x^i}- \left( \lambda _o\frac{\partial F}{\partial x^i}+ p_j\frac{\partial X^j}{\partial x^i}\right) \frac{\partial }{\partial p_i} . \end{aligned}$$
(47)

Comparing this vector field \(X_N\) with the above \(X_{\textrm{H}}\), in the latter we have a new variable, z, hence two new terms, one for \({\dot{z}}\) and the other for \({\dot{p}}_z\). Moreover, the term corresponding to \(p_i\) has changed.

But if neither the cost function F nor the vector field X depends on z, then \({\dot{p}}_z=0\), hence \(p_z=\textrm{constant}\), and both sets of equations, the classical ones and those of the Herglotz optimal control problem, are the same. In fact, in this case we can replace the differential equation \({\dot{z}}=F(x,z,u)\) and the optimality condition by the integral to be optimized

$$\begin{aligned} \int _a^b F(x,u)\textrm{d}t \end{aligned}$$

and we obtain exactly the classical problem.

Hence, as we proposed at the beginning of the section, we actually have a generalization of the classical optimal control problem from the point of view of the equations solving the problem.
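The comparison can be made explicit with a short computation. Assume that neither F nor the components \(X^i\) depend on z, and write \(\lambda :=p_o+p_z\); the symbol \(\lambda \) is introduced here only for this sketch. From the components of \(X_{\textrm{H}}\) one obtains

$$\begin{aligned} {\dot{p}}_z=0,\qquad {\dot{\lambda }}=0,\qquad {\dot{p}}_i=-\lambda \frac{\partial F}{\partial x^i}-p_j\frac{\partial X^j}{\partial x^i},\qquad \frac{\partial H}{\partial u^a}=\lambda \frac{\partial F}{\partial u^a}+p_j\frac{\partial X^j}{\partial u^a}=0 , \end{aligned}$$

which have the same form as the \(p_i\) component of \(X_N\) in (47) and the classical compatibility conditions, with the constant \(\lambda \) playing the role of \(\lambda _o\).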

7.3 Contact Formulation for the Normal Solutions

We can analyze the set of normal solutions, that is \(p_o\ne 0\), in the spirit of Sect. 4.2.2 and obtain these solutions as integral curves of contact dynamical systems.

To proceed, suppose we are in the regular situation, that is, the maximality conditions allow us to determine the controls. To study this situation we can fix the controls, which are determined by the last equations of the problem, and analyze the other equations as solutions of a symplectic problem. Then, once fixed \(u=u_o\), our manifold is \(T^*(\mathbb {R}\times M\times \mathbb {R})\). In this manifold we can analyze the problem as a contact dynamical system.

For a given \(\lambda _o\in \mathbb {R}\), \(\lambda _o\ne 0\), consider the submanifold \(N_{\lambda _o}\subset T^*(\mathbb {R}\times M\times \mathbb {R})\), given by \(p_o=\lambda _o\) and the natural injection \(j_{\lambda _o}:N_{\lambda _o} \hookrightarrow T^*(\mathbb {R}\times M\times \mathbb {R})\). Let \(\eta =-j_{\lambda _o}^*\theta \in \Omega ^1(N_{\lambda _o})\). Then we have

Lemma 4

For every fixed \(u\in U\), the manifold \((N_{\lambda _o}, \eta )\) is a contact manifold. Its Reeb vector field is given by

$$\begin{aligned} R_{\lambda _o}=-\frac{ 1}{ \lambda _o}\frac{\partial }{\partial x^o} \end{aligned}$$

The proof is straightforward using the local expression of \(\eta \)

$$\begin{aligned} \eta = -\lambda _o\textrm{d}x^o-p_i\textrm{d}x^i-p_z\textrm{d}z . \end{aligned}$$
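The verification can be sketched as follows, writing \(m=\dim M\) (a symbol introduced here for the computation, not taken from the text), so that \(\dim N_{\lambda _o}=2(m+1)+1\). From the local expression of \(\eta \),

$$\begin{aligned} \textrm{d}\eta&=\textrm{d}x^i\wedge \textrm{d}p_i+\textrm{d}z\wedge \textrm{d}p_z ,\\ \eta \wedge (\textrm{d}\eta )^{m+1}&=-\lambda _o\,(m+1)!\;\textrm{d}x^o\wedge \textrm{d}x^1\wedge \textrm{d}p_1\wedge \cdots \wedge \textrm{d}x^m\wedge \textrm{d}p_m\wedge \textrm{d}z\wedge \textrm{d}p_z ,\\ {\textbf{i}}(R_{\lambda _o})\eta&=-\lambda _o\left( -\frac{1}{\lambda _o}\right) =1,\qquad {\textbf{i}}(R_{\lambda _o})\textrm{d}\eta =0 . \end{aligned}$$

The second line does not vanish precisely because \(\lambda _o\ne 0\), and the last identity holds because \(\textrm{d}\eta \) contains no term in \(\textrm{d}x^o\).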

Let \(H_{N_{\lambda _o}}=j_{\lambda _o}^*H\) and consider the Hamiltonian contact system given by \((N_{\lambda _o},\eta , H_{N_{\lambda _o}})\). Let \(Z\in {\mathfrak {X}}(N_{\lambda _o})\) be the corresponding Hamiltonian vector field, that is, the solution to the contact equations

$$\begin{aligned} {\textbf{i}}(Z)\eta =-H_{N_{\lambda _o}},\,\,\,{\textbf{i}}(Z)\textrm{d}\,\eta =\textrm{d}\, H_{N_{\lambda _o}}-(L(R_{\lambda _o})H_{N_{\lambda _o}})\eta \, \end{aligned}$$

whose local expression is

$$\begin{aligned} X_{\textrm{H}}= & {} F\frac{\partial }{\partial x^o}+0\frac{\partial }{\partial p_o}+ X^i\frac{\partial }{\partial x^i}+F\frac{\partial }{\partial z}\\{} & {} -\left( p_o\frac{\partial F}{\partial x^i}+ p_j\frac{\partial X^j}{\partial x^i}+ p_z\frac{\partial F}{\partial x^i}\right) \frac{\partial }{\partial p_i}- \left( p_o\frac{\partial F}{\partial z} + p_i \frac{\partial X^i}{\partial z} +p_z\frac{\partial F}{\partial z}\right) \frac{\partial }{\partial p_z} . \end{aligned}$$
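As a check, one can verify directly that this vector field satisfies the contact equations. The computation below is a sketch for fixed u, using \(p_o=\lambda _o\) on \(N_{\lambda _o}\), the expression \(H_{N_{\lambda _o}}=(\lambda _o+p_z)F+p_iX^i\) and the fact that neither F nor \(X^i\) depends on \(x^o\):

$$\begin{aligned} L(R_{\lambda _o})H_{N_{\lambda _o}}&=-\frac{1}{\lambda _o}\frac{\partial H_{N_{\lambda _o}}}{\partial x^o}=0 ,\\ {\textbf{i}}(X_{\textrm{H}})\eta&=-\lambda _oF-p_iX^i-p_zF=-H_{N_{\lambda _o}} ,\\ {\textbf{i}}(X_{\textrm{H}})\textrm{d}\eta&=X^i\,\textrm{d}p_i+F\,\textrm{d}p_z+\frac{\partial H_{N_{\lambda _o}}}{\partial x^i}\,\textrm{d}x^i+\frac{\partial H_{N_{\lambda _o}}}{\partial z}\,\textrm{d}z=\textrm{d}H_{N_{\lambda _o}} . \end{aligned}$$

Here \(\textrm{d}\eta =\textrm{d}x^i\wedge \textrm{d}p_i+\textrm{d}z\wedge \textrm{d}p_z\), and the \(p_i\) and \(p_z\) components of \(X_{\textrm{H}}\) are \(-\partial H_{N_{\lambda _o}}/\partial x^i\) and \(-\partial H_{N_{\lambda _o}}/\partial z\), respectively.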

With the above expressions and comments we have proven the

Theorem 11

The normal solutions to the problem 7.2 corresponding to \(p_o=\lambda _o\ne 0\) are the projections to \(\mathbb {R}\times M\times \mathbb {R}\times U\) of the curves solution to the contact Hamiltonian problem given by \((N_{\lambda _o},\eta , H_{N_{\lambda _o}})\).

The corresponding differential equations for the curves solution to this Hamiltonian contact problem are:

$$\begin{aligned} {\dot{x}}^o= & {} F \\ {\dot{x}}^i= & {} X^i\\ {\dot{z}}= & {} F\\ {\dot{p}}_i= & {} -p_o\frac{\partial F}{\partial x^i}- p_j\frac{\partial X^j}{\partial x^i}-p_z\frac{\partial F}{\partial x^i}\\ {\dot{p}}_z= & {} - p_o\frac{\partial F}{\partial z}- p_i \frac{\partial X^i}{\partial z} -p_z\frac{\partial F}{\partial z} \end{aligned}$$

together with the maximization condition, that is, the constraints obtained from the compatibility of the presymplectic equation,

$$\begin{aligned} \frac{\partial H}{\partial u^1} = 0,\ldots ,\frac{\partial H}{\partial u^k}=0 . \end{aligned}$$

7.4 Reduction of the Problem

We remark that this problem is a generalization of the Herglotz variational principle. In the previous section, we showed that the equations obtained through the Pontryagin maximum principle could be reduced to obtain the Herglotz equation. In this section, we show that a similar reduction can be applied in this more general case.

We see from the differential equations above that, taking the same initial condition for both variables, we have \(x^o = z\) for the solutions of problem 7.2. Hence one of them is redundant and can be eliminated. As the momentum corresponding to \(x^o\) is constant, we can eliminate the pair \((x^o, p_o)\). Observe that, in fact, \(p_z\) is also irrelevant to the problem. Thus we can reduce the dimension of the state space of the problem; we now construct this new manifold. Consider the Hamiltonian

$$\begin{aligned} \begin{aligned} H_0 : W_0 = T^*M \times \mathbb {R}\times U&\rightarrow \mathbb {R}, \\ (x^i,p_i,z, u^a)&\mapsto p_i X^i(x^i,z, u^a) - \textrm{pr}_1^* F(x,z,u), \end{aligned} \end{aligned}$$
(48)

where \(\textrm{pr}_1:T^*M \times \mathbb {R}\times U \rightarrow M \times \mathbb {R}\times U\) is the natural projection. Also consider the canonical contact form on \(T^*M \times \mathbb {R}\)

$$\begin{aligned} \eta _0 = \textrm{d}z - p_i \textrm{d}x^i. \end{aligned}$$
(49)

Theorem 12

The normal solutions to the problem 7.2 corresponding to \(p_o=\lambda _o\) are the projections to \(\mathbb {R}\times M\) of the curves solution to the contact Hamiltonian problem given by \((T^*M \times \mathbb {R}, \eta _0, H_0)\).

The equations of motion of the aforementioned Hamiltonian problem are

$$\begin{aligned} {\dot{x}}^i&= X^i, \end{aligned}$$
(50a)
$$\begin{aligned} {\dot{p}}_i&= p_i \frac{\partial F}{\partial z} - p_j \frac{\partial X^j}{\partial x^i} + \frac{\partial F}{\partial x^i} - \frac{\partial X^j}{\partial z} p_i p_j, \end{aligned}$$
(50b)
$$\begin{aligned} {\dot{z}}&= F \end{aligned}$$
(50c)

subject to the constraints

$$\begin{aligned} \frac{\partial H_0}{\partial u^a} = p_j \frac{\partial X^j}{\partial u^a} - \frac{\partial F}{\partial u^a} = 0. \end{aligned}$$
(50d)
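These equations can be recovered from the general coordinate form of the contact Hamiltonian equations. As a sketch: for a contact form \(\eta _0=\textrm{d}z-p_i\textrm{d}x^i\) and a Hamiltonian H, with the sign convention \({\textbf{i}}(Z)\eta _0=-H\) used in Sect. 7.3, the Hamiltonian vector field is given by

$$\begin{aligned} {\dot{x}}^i=\frac{\partial H}{\partial p_i},\qquad {\dot{p}}_i=-\left( \frac{\partial H}{\partial x^i}+p_i\frac{\partial H}{\partial z}\right) ,\qquad {\dot{z}}=p_i\frac{\partial H}{\partial p_i}-H . \end{aligned}$$

Substituting \(H_0=p_iX^i-F\) for fixed u gives

$$\begin{aligned} \frac{\partial H_0}{\partial p_i}=X^i,\qquad \frac{\partial H_0}{\partial x^i}=p_j\frac{\partial X^j}{\partial x^i}-\frac{\partial F}{\partial x^i},\qquad \frac{\partial H_0}{\partial z}=p_j\frac{\partial X^j}{\partial z}-\frac{\partial F}{\partial z},\qquad p_i\frac{\partial H_0}{\partial p_i}-H_0=F , \end{aligned}$$

which yields (50a), (50b) and (50c). The vanishing of \(\partial H_0/\partial u^a=p_j\,\partial X^j/\partial u^a-\partial F/\partial u^a\) is the constraint (50d).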

Remark 6

In the case that the problem is singular, one would work instead with the precontact system \((T^*M \times \mathbb {R}\times U, \eta _0, H_0)\), applying the appropriate constraint algorithm.

Proof

Let \(\gamma \) be a solution of the Herglotz optimal control problem. By Theorem 11, we know that there exists a solution curve \(\sigma \) of the corresponding contact system on \(N_{\lambda _0}\). In order to prove this theorem, we will project \(\sigma \) onto a solution of the system \((T^*M \times \mathbb {R}, \eta _0, H_0)\).

First of all, notice that the solutions satisfy \(x^o = z\), hence \(\sigma \) will lie on the submanifold \(j:{\tilde{N}}_{\lambda _0} \rightarrow N_{\lambda _0}\) defined by \(x^o = z\).

The dynamical vector field X of the precontact system \(({N_{\lambda _0}} , {\eta _{\lambda _0}} , {H_{\lambda _0}} )\) is tangent to the submanifold \(\tilde{N}_{\lambda _0}\). Indeed, the restriction of X to \({\tilde{N}}_{\lambda _0}\) is just the dynamical vector field \({\tilde{X}}\) of the induced precontact system \(({\tilde{N}}_{\lambda _0}, \tilde{\eta }_{\lambda _0} = j^* \eta _{\lambda _0}, {\tilde{H}}_{\lambda _0} = j^* H_{\lambda _0})\). In coordinates

$$\begin{aligned} \tilde{\eta }_{\lambda _0}&= (- \lambda _0 - p_z) \textrm{d}z - p_i \textrm{d}x^i, \end{aligned}$$
(51a)
$$\begin{aligned} {\tilde{H}}_{\lambda _0}&= (\lambda _0 + p_z) F + p_i X^i \end{aligned}$$
(51b)

Consider the following commutative diagram,

[Commutative diagram not reproduced.]

where

$$\begin{aligned} \Phi _{\lambda _0}\big (x^i,z,p_i,p_z\big ) = \Big (x^i, -\frac{p_i}{\lambda _o+p_z}, z\Big ). \end{aligned}$$
(53)

Notice that, where \(\lambda _o+p_z\ne 0\), \(\Phi _{\lambda _0}\) is a submersion and a conformal equivalence of precontact systems:

$$\begin{aligned} \Phi _{\lambda _0}^* \eta _0&= -\frac{1}{\lambda _o+p_z}\, \tilde{\eta }_{\lambda _0}, \end{aligned}$$
(54a)
$$\begin{aligned} \Phi _{\lambda _0}^* H_0&= -\frac{1}{\lambda _o+p_z}\, {\tilde{H}}_{\lambda _0}. \end{aligned}$$
(54b)

By Theorem 1, projections of the solution curves of the precontact system on \({\tilde{N}}_{\lambda _0}\) are solution curves to the contact system on \(T^*M\times \mathbb {R}\). \(\square \)

As a consequence of this theorem, we can obtain again the Herglotz equations. Consider the Herglotz problem in Sect. 6.1 for a Lagrangian \(L:TQ \times \mathbb {R}\rightarrow \mathbb {R}\). Notice that this problem is a particular case of the Herglotz optimal control problem, where

  • Controls are the velocities \(u^a = v^i\).

  • The cost function is the Lagrangian \(F=L\).

  • The control vector field is \(X =v^i \frac{\partial }{\partial q^i}\).

The solutions to this problem are given by Theorem 12:

$$\begin{aligned} {\dot{q}}^i&= v^i, \end{aligned}$$
(55)
$$\begin{aligned} {\dot{p}}_i&= p_i \frac{\partial L}{\partial z} + \frac{\partial L}{\partial q^i} \end{aligned}$$
(56)
$$\begin{aligned} {\dot{z}}&= L \end{aligned}$$
(57)

with the constraints

$$\begin{aligned} \frac{\partial L}{\partial v^i}&= p_i, \end{aligned}$$
(58)

which are precisely the Herglotz equations.
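The identification with the equations of Theorem 10 follows by eliminating the momenta. As a short sketch, differentiating the constraint (58) along a solution and using (56) and then (58) again gives

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\frac{\partial L}{\partial v^i}={\dot{p}}_i=p_i\frac{\partial L}{\partial z}+\frac{\partial L}{\partial q^i}=\frac{\partial L}{\partial v^i}\frac{\partial L}{\partial z}+\frac{\partial L}{\partial q^i} , \end{aligned}$$

that is,

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\frac{\partial L}{\partial v^i}-\frac{\partial L}{\partial q^i}-\frac{\partial L}{\partial z}\frac{\partial L}{\partial v^i}=0 , \end{aligned}$$

together with \({\dot{q}}^i=v^i\) and \({\dot{z}}=L\).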

8 Application: Optimal Control on Thermodynamic Systems

One possible application of this theory is the study of thermodynamic processes which minimize or maximize some thermodynamic potential. As an example, we apply our formalism to the control systems considered in Van der Schaft and Maschke (2017).

The relation between symplectic and contact manifolds via the symplectification procedure has made it possible to go deeper into the geometric description of thermodynamic systems. This approach has been explored in Balian and Valentin (2001) (see also Arnold 1978; Libermann and Marle 1987; Ibáñez et al. 1997).

8.1 Homogeneous Hamiltonian Systems and Contact Systems

There is a close relationship between homogeneous symplectic and contact systems, see for example Van der Schaft and Maschke (2017) where this relation is studied. Here we briefly recall the ideas we need to follow the example.

In the general case, if \(\pi :M\rightarrow B\) is a vector bundle, a function \(F:M\rightarrow \mathbb {R}\) is homogeneous if, for any \(e_p\in M_p=\pi ^{-1}(p)\), \(p\in B\), and any \(\lambda \in \mathbb {R}\), we have \(F(\lambda e_p)=\lambda F(e_p)\). In this situation the function F can be projected to the projective bundle \(\mathcal {P}(M)\) over B obtained by projectivization on every fibre. We are interested in the case that \(M=T^*(Q\times \mathbb {R})\rightarrow Q\times \mathbb {R}\), with natural coordinates \((q^i,z,P_i,P_z)\).

Let H be a homogeneous Hamiltonian function on \(T^*( Q \times \mathbb {R})\). Locally, we have that \(H(q^i,z, \lambda P_i, \lambda P_z) = \lambda H(q^i,z,P_i,P_z)\), for all \(\lambda \in \mathbb {R}\). Equivalently, one can write

$$\begin{aligned} H\big (q^i,z,P_i,P_z\big ) = -P_z\, h\big (q^i,-P_i/P_z, z\big ), \end{aligned}$$
(59)

for \(P_z\ne 0\), where \(h: T^*Q \times \mathbb {R}\rightarrow \mathbb {R}\), \(h(q^i,p_i,z)=H(q^i,z,p_i,-1)\) is well defined.

With the above changes, we have identified the manifold \(T^*Q \times \mathbb {R}\) as the projective bundle \(\mathcal {P}(T^* (Q \times \mathbb {R}))\) of the cotangent bundle \(T^*(Q \times \mathbb {R})\) taking out the points at infinity, that is the subset defined by \(\{P_z = 0\}\).
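A simple instance of the correspondence (59) may help. In the following sketch, the choice \(Q=\mathbb {R}\) and the function \(h(q,p,z)=\frac{p^2}{2m}+V(q)+\kappa z\), with constants \(m>0\) and \(\kappa \), are assumptions made for illustration and are not taken from the text. Formula (59) then gives, for \(P_z\ne 0\),

$$\begin{aligned} H(q,z,P,P_z)=-P_z\left( \frac{P^2}{2mP_z^2}+V(q)+\kappa z\right) =-\frac{P^2}{2mP_z}-P_z\big (V(q)+\kappa z\big ) , \end{aligned}$$

and one checks directly that \(H(q,z,\lambda P,\lambda P_z)=\lambda H(q,z,P,P_z)\) for \(\lambda \ne 0\), while setting \(P_z=-1\) recovers \(H(q,z,p,-1)=\frac{p^2}{2m}+V(q)+\kappa z\).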

Following Van der Schaft and Maschke (2017, Section 4.1), the map

$$\begin{aligned} \begin{aligned} \Phi : T^*( Q \times \mathbb {R}) \setminus \{P_z = 0 \}&\rightarrow T^*Q \times \mathbb {R}\\ (q^i,z,P_i,P_z)&\mapsto (q^i,-P_i/P_z, z) = (q^i, p_i ,z), \end{aligned} \end{aligned}$$
(60)

sends the Hamiltonian symplectic system \((T^*( Q \times \mathbb {R})\setminus \{P_z = 0 \} , \omega _{Q \times \mathbb {R}}, H)\) onto the Hamiltonian contact system \((T^*Q \times \mathbb {R}, \eta _{Q}, h)\), where \(\omega _{Q \times \mathbb {R}} = \textrm{d}q^i \wedge \textrm{d}P_i + \textrm{d}z \wedge \textrm{d}P_z\) and \(\eta _Q = \textrm{d}z - p_i \textrm{d}q^i\). Observe that the natural coordinates of \(T^*Q\times \mathbb {R}\), denoted by \((q^i,p_i,z)\), correspond to the homogeneous coordinates in the projective bundle.

In fact, the map \(\Phi \) is the projectivization; i.e., the map that sends each point in the fibers of \(T^* (Q \times \mathbb {R})\) to the line that passes through it and the origin.

It can be shown that \(\Phi \) provides a bijection between conformal contactomorphisms and homogeneous symplectomorphisms. Moreover, \(\Phi \) maps homogeneous Lagrangian submanifolds \(\mathcal {L} \subseteq T^*( Q \times \mathbb {R})\) onto Legendrian submanifolds \(\mathbb {L} = \Phi (\mathcal {L}) \subseteq T^*Q \times \mathbb {R}\). See Van der Schaft and Maschke (2017) and Sect. 8.3 for more details on these topics.

8.2 Control of Contact Systems

On the natural contact manifold \(T^* Q\times \mathbb {R}\), with coordinates \((q^i,p_i,z)\), assume that we are given a parametrized family of Hamiltonians \(h: T^* Q\times \mathbb {R}\times U \rightarrow \mathbb {R}\), \(U\subset \mathbb {R}^k\), with contact Hamiltonian vector fields \(X_{h_u}\), where \(h_u(q^i,p_i,z)=h(q^i,p_i,z,u)\). Then we can define the control system \(Z(q,p,z,u) = X_{h_u}(q,p,z)\), which is a vector field along the projection \(T^* Q\times \mathbb {R}\times U\rightarrow T^* Q\times \mathbb {R}\); that is, \(Z\) takes values in \(T(T^* Q\times \mathbb {R})\) and \(Z(q,p,z,u)\) is a tangent vector at the point \((q,p,z)\).

A curve \(\gamma :I\rightarrow T^*Q\times \mathbb {R}\times U\) is an integral curve of Z, that is, \(\gamma '=Z\circ \gamma \), if, in local coordinates, it satisfies the differential equations

$$\begin{aligned} \frac{\textrm{d}q^i}{\textrm{d}t}&= \frac{\partial h_u}{\partial p_i}, \\ \frac{\textrm{d}p_i}{\textrm{d}t}&= - \frac{\partial h_u}{\partial q^i} - p_i \frac{\partial h_u}{\partial z},\\ \frac{\textrm{d}z}{\textrm{d}t}&= p_i \frac{\partial h_u}{\partial p_i} - h_u. \end{aligned}$$
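
To make these equations concrete, here is a minimal worked instance. The Hamiltonian below is our own illustrative assumption (a controlled, linearly damped oscillator with hypothetical constants \(m,k,\gamma >0\)); it is not one of the paper's examples. For \(Q=\mathbb {R}\) and

$$\begin{aligned} h_u(q,p,z) = \frac{p^2}{2m} + \frac{k\,q^2}{2} + \gamma \, z - q\, u, \end{aligned}$$

the equations above become

$$\begin{aligned} \frac{\textrm{d}q}{\textrm{d}t}&= \frac{p}{m}, \\ \frac{\textrm{d}p}{\textrm{d}t}&= -k\, q + u - \gamma \, p,\\ \frac{\textrm{d}z}{\textrm{d}t}&= \frac{p^2}{m} - h_u = \frac{p^2}{2m} - \frac{k\,q^2}{2} - \gamma \, z + q\, u. \end{aligned}$$

The term \(-p\, \partial h_u/\partial z = -\gamma \, p\) is the one responsible for dissipation; it has no counterpart in the symplectic Hamilton equations.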

One can consider the Herglotz optimal control problem given by Z, as we stated in Sect. 7.2. Then, by Theorem 12, we know that the normal solutions are the projections of the solutions to the contact system \((T^*(T^*Q) \times \mathbb {R}, \eta _{T^*Q }, H)\), where

$$\begin{aligned} H = p_{q^i}\frac{\partial h_u}{\partial p_i} - p_{p_i}\left( \frac{\partial h_u}{\partial q^i} + p_i \frac{\partial h_u}{\partial z}\right) - p_i \frac{\partial h_u}{\partial p_i} + h_u. \end{aligned}$$
(61)

8.3 Application to Thermodynamic Systems

We consider thermodynamic systems in the so-called entropy representation. Hence the thermodynamic phase space, representing the extensive variables, is the manifold \(T^* Q \times \mathbb {R}\), equipped with its canonical contact form

$$\begin{aligned} \eta _Q = \textrm{d}S - p_i \textrm{d}q^i. \end{aligned}$$
(62)

The local coordinates on the configuration manifold \(Q \times \mathbb {R}\) are \((q^i,S)\), where S is the total entropy and the \(q^i\) denote the remaining extensive variables. Other variables, such as the internal energy, may be chosen instead of the entropy, by means of a Legendre transformation.

The state of a thermodynamic system always lies on the equilibrium submanifold \(\mathbb {L}\subseteq T^* Q \times \mathbb {R}\), which is a Legendrian submanifold, that is, \(\eta _Q\vert _{T\mathbb {L}} = 0\) and \(\dim \mathbb {L}=\dim Q=n\). The pair \((T^* Q \times \mathbb {R}, \mathbb {L})\) is a thermodynamic system. The equations (locally) defining \(\mathbb {L}\) are called the state equations of the system.

On a thermodynamic system \((T^* Q \times \mathbb {R}, \mathbb {L})\), one can consider the dynamics generated by the contact Hamiltonian vector field \(X_h\) associated to a Hamiltonian h. If this dynamics is to represent quasistatic processes, meaning that the system is in equilibrium at every time, that is, its evolution remains in the submanifold \(\mathbb {L}\), then \(X_h\) is required to be tangent to \(\mathbb {L}\). This happens if and only if h vanishes on \(\mathbb {L}\).
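
One direction of this equivalence can be checked directly from the coordinate equations of Sect. 8.2, with \(z\) replaced by \(S\) (a short verification of ours). Evaluating the contact form on \(X_h\) gives

$$\begin{aligned} \eta _Q(X_h) = \dot{S} - p_i\, \dot{q}^i = \Big ( p_i \frac{\partial h}{\partial p_i} - h \Big ) - p_i \frac{\partial h}{\partial p_i} = -h. \end{aligned}$$

Since \(\eta _Q\) vanishes on every vector tangent to \(\mathbb {L}\), tangency of \(X_h\) to \(\mathbb {L}\) forces \(h\vert _{\mathbb {L}} = 0\). The converse relies on \(\mathbb {L}\) being Legendrian and is not reproduced here.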

Equivalently, by Sect. 8.1, one can consider the extended thermodynamic phase space \(T^* (Q \times \mathbb {R})\) with its canonical symplectic form

$$\begin{aligned} \omega _{Q\times \mathbb {R}} = \textrm{d}q^i \wedge \textrm{d}P_i + \textrm{d}S \wedge \textrm{d}P_S. \end{aligned}$$
(63)

In this formulation, a thermodynamic system is a pair \((T^* (Q \times \mathbb {R}), \mathcal {L})\), where \(\mathcal {L}\) is a homogeneous Lagrangian submanifold. Dynamics are given by a homogeneous Hamiltonian K. See Van der Schaft and Maschke (2017) for details, and recall that in Sect. 8.1 we identified the bundle \(T^*Q\times \mathbb {R}\) with the projective bundle \(\mathcal {P}(T^* (Q \times \mathbb {R}))\) with the points at infinity removed.

Port-thermodynamic systems were introduced in Van der Schaft and Maschke (2017), but in a homogeneous symplectic formalism.

Definition 1

(Port-thermodynamic system) A port-thermodynamic system on \(T^*(Q \times \mathbb {R})\) is defined as a pair \((\mathcal {L},K)\), where the homogeneous Lagrangian submanifold \(\mathcal {L} \subset T^*(Q\times \mathbb {R})\) specifies the state properties. The dynamics is given by the homogeneous Hamiltonian dynamics with parametrized homogeneous Hamiltonian \(K:= K^a + K^c_{\alpha } u^{\alpha } : T^*(Q\times \mathbb {R}) \rightarrow \mathbb {R}, \, u \in \mathbb {R}^k\), \(K^c: T^*(Q\times \mathbb {R}) \rightarrow \mathbb {R}^k\), with \(K^a\), \(K^c\) both equal to zero on the points of \(\mathcal {L}\), and \(K^a\) as the internal Hamiltonian. One needs the additional condition

$$\begin{aligned} \frac{\partial K}{\partial S} |_{\mathcal {L} }\ge 0, \end{aligned}$$
(64)

so that the second law of thermodynamics holds.

Using the results of Sect. 8.1, we could instead consider the following contact formulation.

Definition 2

(Port-thermodynamic system, contact formalism) A port-thermodynamic system on \((T^*Q \times \mathbb {R}, \eta _Q)\) is defined as a pair \((\mathbb {L},h)\), where the Legendrian submanifold \({\mathbb {L}} \subset T^*Q \times \mathbb {R}\) specifies the state properties. The dynamics is given by the contact Hamiltonian dynamics with parametrized contact Hamiltonian \(h = h^a + h^c_{\alpha } u^{\alpha } : T^*Q \times \mathbb {R}\rightarrow \mathbb {R}, \, u \in \mathbb {R}^k\), \(h^c: T^*Q\times \mathbb {R}\rightarrow \mathbb {R}^k\), with \(h^a,h^c\) zero on \(\mathbb {L}\), and the internal Hamiltonian \(h^a\) satisfying

$$\begin{aligned} \frac{\partial h}{\partial S} |_{\mathbb {L} }\ge 0, \end{aligned}$$
(65)

so that the second law of thermodynamics holds.

Our theory provides tools to understand which of the available thermodynamic processes minimize the entropy production of the system. Observe that we can consider processes that maximize or minimize other thermodynamic variables, such as the energy, via a Legendre transform.

8.4 Example: Gas–Piston–Damper System

We end this section with an explicit example which can be found in Van der Schaft and Maschke (2017).

Consider an adiabatically isolated cylinder closed by a piston containing a gas with internal energy \(U(V,S)\).

The extended phase space has the following extensive variables:

  • the momentum of the piston \(\pi \),

  • the volume of the gas V,

  • the energy E,

  • the entropy S.

They correspond to \(Q\times \mathbb {R}\) with local coordinates \((V,\pi ,E,S)\). The Legendrian submanifold is given by

$$\begin{aligned} \mathbb {L} = \Big \{ (V,\pi ,E,p_V,p_{\pi },p_E,S) \;\Big |\;&E= \frac{\pi ^2}{2m} + U(V,S),\quad p_V = -p_E \frac{\partial U}{\partial V}, \\&p_{\pi }= -p_E \frac{\pi }{m},\quad p_E = 1\Big /\frac{\partial U}{\partial S} \Big \}. \end{aligned}$$
(66)
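
One way to see where the relations in (66) come from, and that \(\mathbb {L}\) is indeed Legendrian, is to differentiate the energy constraint (a short check of ours, assuming \(\partial U/\partial S\ne 0\)):

$$\begin{aligned} \textrm{d}E&= \frac{\pi }{m}\, \textrm{d}\pi + \frac{\partial U}{\partial V}\, \textrm{d}V + \frac{\partial U}{\partial S}\, \textrm{d}S \\ \Longrightarrow \quad \textrm{d}S&= \frac{1}{\partial U/\partial S}\, \textrm{d}E - \frac{\pi }{m\, \partial U/\partial S}\, \textrm{d}\pi - \frac{\partial U/\partial V}{\partial U/\partial S}\, \textrm{d}V. \end{aligned}$$

The contact form \(\eta _Q = \textrm{d}S - p_V \textrm{d}V - p_{\pi } \textrm{d}\pi - p_E \textrm{d}E\) vanishes on \(\mathbb {L}\) exactly when \(\textrm{d}S = p_V \textrm{d}V + p_{\pi } \textrm{d}\pi + p_E \textrm{d}E\) there. Comparing coefficients gives \(p_E = 1/\frac{\partial U}{\partial S}\), \(p_{\pi } = -p_E\, \pi /m\) and \(p_V = -p_E\, \partial U/\partial V\), as in (66). The submanifold can be parametrized by \((V,\pi ,S)\), so \(\dim \mathbb {L} = 3 = \dim Q\).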

The contact Hamiltonian is then given by

$$\begin{aligned} h = p_V\frac{\pi }{m} +p_{\pi }\left( -\frac{\partial U}{\partial V} -d\frac{\pi }{m}\right) - \frac{d (\frac{\pi }{m})^2}{\frac{\partial U}{\partial S}} + \left( p_{\pi } + p_E \frac{\pi }{m}\right) u, \end{aligned}$$
(67)

where d is the damping coefficient of the damper and m is the mass of the piston.
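
As a check against Definition 2, both the internal part \(h^a\) and the control part \(h^c\) of (67) vanish on \(\mathbb {L}\). Substituting the relations of (66) (a verification of ours, using only (66) and (67)):

$$\begin{aligned} h^c\vert _{\mathbb {L}}&= p_{\pi } + p_E \frac{\pi }{m} = -p_E \frac{\pi }{m} + p_E \frac{\pi }{m} = 0, \\ h^a\vert _{\mathbb {L}}&= -p_E \frac{\partial U}{\partial V}\frac{\pi }{m} - p_E \frac{\pi }{m}\left( -\frac{\partial U}{\partial V} - d\frac{\pi }{m}\right) - \frac{d\, (\frac{\pi }{m})^2}{\frac{\partial U}{\partial S}} \\&= p_E\, d \left( \frac{\pi }{m}\right) ^2 - \frac{d\, (\frac{\pi }{m})^2}{\frac{\partial U}{\partial S}} = 0, \end{aligned}$$

where the last step uses \(p_E = 1/\frac{\partial U}{\partial S}\).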

The Hamiltonian vector field is given by

$$\begin{aligned} X_h= & {} \, \frac{{\pi }}{m} \frac{\partial }{\partial V } + \left( -\frac{{\pi } d}{m} + u - \frac{\partial \,U}{\partial V} \right) \frac{\partial }{\partial {\pi } } + \frac{{\pi } u}{m} \frac{\partial }{\partial E }\nonumber \\{} & {} + \left( {\left( {p_\pi } \frac{\partial ^2\,U}{\partial V\partial S} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}\right) } {p_V} + {p_\pi } \frac{\partial ^2\,U}{\partial V ^ 2} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial V\partial S}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} \right) \frac{\partial }{\partial {p_V} }\nonumber \\{} & {} + \left( {\left( {p_\pi } \frac{\partial ^2\,U}{\partial V\partial S} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}\right) } {p_\pi } + \frac{d {p_\pi }}{m} - \frac{{p_E} u}{m} - \frac{{p_V}}{m} + \frac{2 \, {\pi } d}{m^{2} \frac{\partial U}{\partial S}} \right) \frac{\partial }{\partial {p_\pi } }\nonumber \\{} & {} + {\left( {p_\pi } \frac{\partial ^2\,U}{\partial V\partial S} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}\right) } {p_E} \frac{\partial }{\partial {p_E} } \nonumber \\{} & {} + \left( \frac{{\pi }^{2} d}{m^{2} \frac{\partial U}{\partial S}} \right) \frac{\partial }{\partial S } \end{aligned}$$
(68)

We construct the contact Hamiltonian system \((T^*(T^*Q)\times \mathbb {R}, \eta _{T^*Q}, H)\) as in (61):

$$\begin{aligned} \begin{aligned} H =&\, -{\left( \frac{d {p_\pi }}{m} - \frac{{p_E} u}{m} - \frac{{p_V}}{m} + \frac{2 \, {\pi } d}{m^{2} \frac{\partial U}{\partial S}}\right) } {\Pi _\pi } \\&- {\left( {p_\pi } \frac{\partial ^2\,U}{\partial V ^ 2} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial V\partial S}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}\right) } {\Pi _V}\\&- {\left( \frac{{\pi } d}{m} - u + \frac{\partial U}{\partial V}\right) } {\Pi _{p_\pi }} + \frac{{\pi } {\Pi _{p_E}} u}{m} + \frac{{\pi } {\Pi _{p_V}}}{m} - \frac{{\pi }^{2} d}{m^{2} \frac{\partial U}{\partial S}}, \end{aligned} \end{aligned}$$
(69)

where \(q^i, p_{q^i}, \Pi _{q^i}, \Pi _{p_{q^i}}\) denote the natural coordinates on \(T^*T^*Q\), \(q^i\) runs through \(V,\pi ,E\), and \(\Pi _{q^i}, \Pi _{p_{q^i}}\) are the momenta conjugate to \(q^i\) and \(p_{q^i}\), respectively.

The solutions to the control problem are then the integral curves of the Hamiltonian vector field of this system, which are the following

$$\begin{aligned} {{\dot{V}}}= & {} \frac{{\pi }}{m} \\ {\dot{ {\pi }}}= & {} -\frac{{\pi } d}{m} + u - \frac{\partial \,U}{\partial V} \\ {{\dot{E}}}= & {} \frac{{\pi } u}{m} \\ {\dot{ {p_V}}}= & {} {\left( {p_\pi } \frac{\partial ^2\,U}{\partial V\partial S} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}\right) } {p_V} + {p_\pi } \frac{\partial ^2\,U}{\partial V ^ 2} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial V\partial S}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} \\ \dot{p_\pi }= & {} {\left( {p_\pi } \frac{\partial ^2\,U}{\partial V\partial S} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}\right) } {p_\pi } + \frac{d {p_\pi }}{m} - \frac{{p_E} u}{m} - \frac{{p_V}}{m} + \frac{2 \, {\pi } d}{m^{2} \frac{\partial U}{\partial S}} \\ \dot{p_E}= & {} {\left( {p_\pi } \frac{\partial ^2\,U}{\partial V\partial S} - \frac{{\pi }^{2} d \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}\right) } {p_E} \\ {\dot{S}}= & {} \frac{{\pi }^{2} d}{m^{2} \frac{\partial U}{\partial S}}\\ \dot{\Pi _V}= & {} \alpha {\Pi _V} - \frac{{\Pi _\pi }}{m}\\ \dot{\Pi _\pi }= & {} \alpha {\Pi _E} - \frac{{\Pi _\pi } u}{m}\\ \dot{\Pi _{p_V}}= & {} -{p_\pi } {p_V} {\Pi _V} \frac{\partial ^3\,U}{\partial V ^ 2\partial S} - {p_\pi } {\Pi _V} \frac{\partial ^3\,U}{\partial V ^ 3} - {p_V} {\Pi _{p_\pi }} \frac{\partial ^2\,U}{\partial V\partial S} - {p_\pi } {\Pi _{p_V}} \frac{\partial ^2\,U}{\partial V\partial S} \\{} & {} + \alpha {\Pi _{p_V}} - {\Pi _{p_\pi }} \frac{\partial ^2\,U}{\partial V ^ 2} + \frac{{\pi }^{2} d {p_V} {\Pi _V} \frac{\partial ^3\,U}{\partial V\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} - \frac{2 \, {\pi }^{2} d {p_V} {\Pi _V} \frac{\partial ^2\,U}{\partial V\partial S} \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial 
S}\right) ^{3}} \\{} & {} - \frac{2 \, {\pi }^{2} d {\Pi _V} \frac{\partial ^2\,U}{\partial V\partial S}^{2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{3}} + \frac{{\pi }^{2} d {\Pi _V} \frac{\partial ^3\,U}{\partial V ^ 2\partial S}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} + \frac{2 \, {\pi } d {p_V} {\Pi _\pi } \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} + \frac{{\pi }^{2} d {\Pi _{p_V}} \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} \\{} & {} + \frac{2 \, {\pi } d {\Pi _\pi } \frac{\partial ^2\,U}{\partial V\partial S}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}\\ \dot{\Pi _{p_\pi }}= & {} -{p_\pi }^{2} {\Pi _V} \frac{\partial ^3\,U}{\partial V ^ 2\partial S} - 2 \, {p_\pi } {\Pi _{p_\pi }} \frac{\partial ^2\,U}{\partial V\partial S} + \alpha {\Pi _{p_\pi }} + \frac{{\pi }^{2} d {p_\pi } {\Pi _V} \frac{\partial ^3\,U}{\partial V\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} \\{} & {} - \frac{2 \, {\pi }^{2} d {p_\pi } {\Pi _V} \frac{\partial ^2\,U}{\partial V\partial S} \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{3}} \\{} & {} - \frac{d {\Pi _{p_\pi }}}{m} + \frac{{\Pi _{p_E}} u}{m} + \frac{2 \, {\pi } d {p_\pi } {\Pi _\pi } \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} + \frac{{\pi }^{2} d {\Pi _{p_\pi }} \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} \\{} & {} + \frac{{\Pi _{p_V}}}{m} + \frac{2 \, {\pi } d {\Pi _V} \frac{\partial ^2\,U}{\partial V\partial S}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} - \frac{2 \, d {\Pi _\pi }}{m^{2} \frac{\partial U}{\partial S}} \end{aligned}$$
$$\begin{aligned} \begin{aligned} \dot{\Pi _{p_E}}&= -{p_E} {p_\pi } {\Pi _V} \frac{\partial ^3\,U}{\partial V ^ 2\partial S} - {p_\pi } {\Pi _{p_E}} \frac{\partial ^2\,U}{\partial V\partial S} - {p_E} {\Pi _{p_\pi }} \frac{\partial ^2\,U}{\partial V\partial S} + \alpha {\Pi _{p_E}} \\&\quad + \frac{{\pi }^{2} d {p_E} {\Pi _V} \frac{\partial ^3\,U}{\partial V\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} \\&\quad - \frac{2 \, {\pi }^{2} d {p_E} {\Pi _V} \frac{\partial ^2\,U}{\partial V\partial S} \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{3}} + \frac{2 \, {\pi } d {p_E} {\Pi _\pi } \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} + \frac{{\pi }^{2} d {\Pi _{p_E}} \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}}, \end{aligned} \end{aligned}$$
(70)

where

$$\begin{aligned} \alpha&= \frac{\partial F}{\partial S}-\Pi _j \frac{\partial X_j}{\partial S} \\&= \scriptstyle -{p_E} {p_\pi } {\Pi _{p_E}} \frac{\partial ^3\,U}{\partial V\partial S ^ 2} - {p_\pi }^{2} {\Pi _{p_\pi }} \frac{\partial ^3\,U}{\partial V\partial S ^ 2} - {p_\pi } {p_V} {\Pi _{p_V}} \frac{\partial ^3\,U}{\partial V\partial S ^ 2} - {p_\pi } {\Pi _{p_V}} \frac{\partial ^3\,U}{\partial V ^ 2\partial S} \\&\quad + {\Pi _\pi } \frac{\partial ^2\,U}{\partial V\partial S} - \frac{2 \, {\pi }^{2} d {p_E} {\Pi _{p_E}} \left( \frac{\partial ^2\,U}{\partial S ^ 2}\right) ^{2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{3}} \\&\quad - \scriptstyle \frac{2 \, {\pi }^{2} d {p_\pi } {\Pi _{p_\pi }} \left( \frac{\partial ^2\,U}{\partial S ^ 2}\right) ^{2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{3}} - \frac{2 \, {\pi }^{2} d {p_V} {\Pi _{p_V}} \left( \frac{\partial ^2\,U}{\partial S ^ 2}\right) ^{2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{3}} + \frac{{\pi }^{2} d {p_E} {\Pi _{p_E}} \frac{\partial ^3\,U}{\partial S ^ 3}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} + \frac{{\pi }^{2} d {p_\pi } {\Pi _{p_\pi }} \frac{\partial ^3\,U}{\partial S ^ 3}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} \\&\quad \scriptstyle + \frac{{\pi }^{2} d {p_V} {\Pi _{p_V}} \frac{\partial ^3\,U}{\partial S ^ 3}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} + \frac{{\pi }^{2} d {\Pi _{p_V}} \frac{\partial ^3\,U}{\partial V\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} - \frac{2 \, {\pi }^{2} d {\Pi _{p_V}} \frac{\partial ^2\,U}{\partial V\partial S} \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{3}} - \frac{ {\pi }^{2} d \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} + \frac{2 \, {\pi } d {\Pi _{p_\pi }} \frac{\partial ^2\,U}{\partial S ^ 2}}{m^{2} \left( \frac{\partial U}{\partial S}\right) ^{2}} , 
\end{aligned}$$

and they are subject to the constraint

$$\begin{aligned} \frac{{p_E} {\Pi _\pi }}{m} + \frac{{\pi } {\Pi _{p_E}}}{m} + {\Pi _{p_\pi }} = 0. \end{aligned}$$
(71)
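
The constraint (71) can be read off from (69). The Hamiltonian is affine in the control, so the condition \(\partial H/\partial u = 0\) reduces to the vanishing of the coefficient of u. Collecting the terms of (69) that contain u, and writing the momenta as \(\Pi \) as in (70) and (71), one finds (our own bookkeeping, with \(H_0\) denoting the u-independent part):

$$\begin{aligned} H = H_0 + u\left( \frac{{p_E} {\Pi _\pi }}{m} + \frac{{\pi } {\Pi _{p_E}}}{m} + {\Pi _{p_\pi }}\right) , \qquad \frac{\partial H}{\partial u} = \frac{{p_E} {\Pi _\pi }}{m} + \frac{{\pi } {\Pi _{p_E}}}{m} + {\Pi _{p_\pi }}. \end{aligned}$$

Since u does not appear in this expression, (71) is a condition on the states and momenta and does not by itself fix the value of the control.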

9 Conclusions and Future Work

We have discussed several presentations of optimal control theory, using presymplectic and contact geometry. These relations allow us to obtain directly a new proof of the equations solving the Herglotz variational principle. One of the main results is the derivation of a Pontryagin maximum principle in the setting of Herglotz optimal control problems, a generalization of classical optimal control. We have also shown how the theory can be applied to thermodynamic systems.

The results obtained in the present paper open several directions for future work, which we intend to pursue. Some of them are:

  1. Relations between the contact vakonomic dynamics and the Herglotz Optimal Control Problem, along the same lines as in Martínez et al. (2000, 2001) for the symplectic case.

  2. The study of the more general case of Herglotz variational calculus with constraints, as in Gràcia et al. (2003) and references therein.

  3. Reduction of the Herglotz Optimal Control Problem in the presence of symmetries, and reconstruction of the original solutions from the reduced ones [see Echeverría et al. (2003) and de León et al. (2004) for the classical setting].

  4. Potential extensions to control problems with dissipation on Lie groupoids and algebroids, and numerical methods to solve them (see Cortés et al. 2006).

  5. Study of contact mechanical systems with controls, and their stabilization and tracking problems (see for example, Cortés et al. 2002; Muñoz-Lecanda and Yániz-Fernández 2002; Cortés and Martínez 2003).