1 Introduction

In recent decades, there has been increasing interest in plant modelling. Indeed, recent studies on how plants perceive and react to the external environment (see Chauvet et al. 2019; Bastien et al. 2013; Moulia et al. 2019; Meroz et al. 2019 for instance) have led to a deeper insight into the mechanisms of plant growth. Current models consider proprioception (Bastien et al. 2013), internal fluxes of hormones (Moulton et al. 2020) or memory in the elaboration of the external cues (Meroz et al. 2019). Plant self-supporting structures are modelled as morphoelastic rods, whose curvatures change in time according to the plant sensing activity. Furthermore, plants exhibit great variability in their biomechanical properties, both within and between species (Rowe et al. 2004). In particular, climbing plants are a clear example of this structural variety. Consider the species Condylocarpon guianense, a common liana widely found in the flora of French Guiana. C. guianense is a twining climbing plant, which means that it reaches the canopy of the forest by twining around the branches and the trunks of its hosts. Several studies on its structure (see for instance Rowe and Speck 1996; Rowe et al. 2004; Rowe and Speck 2015 for a biological insight, or Vecchiato et al. (2023) for a modelling point of view) have revealed that in different growth stages it changes the thickness and the nature of the layers that form its stem and, consequently, its stiffness. More specifically, the plant is stiffer when and where it is developing a self-supporting state, while it displays a less dense material and a thicker compliant cortex when attached to a support.

Thanks to their wide adaptability, plants have become a source of inspiration for new technologies and robots (Fiorello et al. 2020; Mazzolai et al. 2014). For instance, the movement of roots into the ground has inspired the development of new ways of exploring soil (Mazzolai 2017), and the efficient strategies plants use to attach their bodies to external supports have led to the development of a robot that imitates the wrapping movement and the stem stiffening of tendrils (Meder et al. 2022). In this paper, we are interested in modelling the self-supporting structures developed by climbing plants. These structures are called searcher shoots and are generated by the plant in order to explore and find support. The mechanics of searcher shoots is an interesting and challenging subject of study (Hattermann et al. 2022) because they exhibit both active and passive movements and must find a compromise between rigidity and flexibility to explore the surrounding environment and navigate obstacles and supports. In particular, we want to use the tools of optimal control theory to better understand how mass is distributed along a searcher shoot. More precisely, for a given amount of mass, we want to find the best way to distribute it so as to maximize the length of the shoot without exceeding a given amount of mechanical stress. To achieve this goal, we formulate an optimal control problem for time (length) maximization subject to mixed state constraints. This optimization problem belongs to a classical category of problems on beam buckling. In particular, Keller et al. (1960) investigated the shape of the column that has the largest buckling load. From that work, further studies and methods were developed to solve the problem of the tallest column. Of particular note in this line of research are the works of Farjoun and Neu (2005) and Wei et al. (2012). In the former, a symmetry of the dynamical system is employed to solve the boundary value problem related to maximizing height. In the latter, the same technique is used to solve the problem of the tree branch with the furthest reach: the solution is studied analytically towards the tip, while the overall behaviour is displayed only via numerical simulations. Here, the problem is slightly different, since we look for a length maximization. Moreover, we employ a deeper study of the necessary conditions for optimality to characterize the optimal radius of the cross-section analytically.

This work is structured in the following way. In Sect. 2, we use the theory of elastic rods to derive the differential equation governing the mechanics of the searcher shoot. Then, we couple this equation with the boundary and stress constraints and a cost function. In this way, we state the optimal control problem of length maximization. In Sect. 3, we extend the dynamics by convexifying the set of velocities, therefore allowing for a larger set of velocities. This relaxation of the problem allows us to prove the existence of an optimal solution. In Sect. 4, we reformulate the optimal control problem and find a set of necessary conditions for the optimal trajectory. Then, in Sect. 5 we study the corresponding adjoint system and obtain a feedback formula for the optimal control \(u^*\). Finally, in Sect. 6, we present numerical simulations of the optimal trajectory and the adjoint arcs based on experimental data. In those simulations, we assume that the sample under examination is maximizing its length following the principles stated in our model. This allows us to consider the length of the sample as the maximal length \(L^*\) of the optimization problem. Then, we estimate the constants of the model in order to make it fit the optimization framework. In the last sections, we discuss the results of these simulations and the analytical optimal solution. Although the analysis of the problem is rather technical, we obtain as a final result a simple relation between the curvature \(\theta '\) of the stem and the radius R of the cross-section:

$$\begin{aligned} R|\theta '| = \text { constant}. \end{aligned}$$

Such an equality means that the stem is thinner, hence more flexible, where the curvature is higher. Equivalently, since the maximal bending stress is \(\sigma _M = E R |\theta '|\), the stress constraint is active along the whole optimal shoot.

2 Derivation of the Model

We model the searcher shoot as an inextensible and unshearable elastic rod whose centerline is confined in a plane. Let \(\{e_1,e_2,e_3\}\) be an orthonormal basis of \({\mathbb {R}}^3\). Then we represent the centerline of the rod by a curve \(r \in C^2([0,L] \rightarrow {\mathbb {R}}^3)\) of length L. We parametrize the curve with its arc-length parameter \(s \in [0,L]\) and we assume that \(r \in \text {span}\{e_1, e_3\}\). For the rod cross-section, we consider as generalized frame (Goriely 2017) the Frenet moving frame \(\{ \nu , \beta , \tau \}\), formed, respectively, by the normal, binormal and tangent vectors. Naming \(\theta (s)\) the angle between \(\tau (s)\) and \(e_3\) (that is, the vertical line), the curvature \(\kappa (s)\) is simply

$$\begin{aligned} \theta '(s) = \kappa (s), \end{aligned}$$

where “\( ' \)” denotes the derivative with respect to s. Since r is confined in the plane, it is entirely described by its initial point r(0) (whose value is fixed), initial inclination \(\theta (0)\) and its curvature \(\kappa \).

We want to account for the gravity force acting on the rod r, \(\{ \nu , \beta , \tau \}\). To this aim, we consider another configuration \({\hat{r}}\), \(\{ {\hat{\nu }}, {\hat{\beta }}, {\hat{\tau }} \}\) confined in \(\text {span}\{e_1,e_3\}\). We refer to this latter configuration as the intrinsic configuration: it represents the shape that the rod would have in the absence of the passive elastic deflection due to its weight. We refer to the former configuration as the current configuration. In our model, we assume that the intrinsic configuration is just a straight line starting from r(0) and directed along \(e_1\). In particular, this means that we consider the intrinsic curvature \({\hat{\kappa }}\) identically zero all along the stem. In response to the gravity force, the rod generates an internal force n and an internal moment (of force) m. Assuming that the gravity force is oriented along \(-e_3\) and that it is balanced by the rod's response, we get the following set of equations:

$$\begin{aligned} {\left\{ \begin{array}{ll} n'(s) -e_3 g \rho (s) = 0 &{} \text { in } [0,L]\\ m'(s) + r'(s) \times n(s) = 0 &{} \text { in } [0,L], \end{array}\right. } \end{aligned}$$
(1)

where g is the gravitational acceleration and \(\rho \) is the mass density per unit length of the rod. We refer to \(\rho \) as the linear density. To close the set of differential equations, we need a constitutive relationship between the internal moment and the difference between the current and the intrinsic curvature, \(\kappa \) and \({\hat{\kappa }} \equiv 0\), respectively. The Euler–Bernoulli law provides a classical relationship of this kind, which in the planar case takes the following form:

$$\begin{aligned} m(s) = E(s)I(s) \kappa (s) \text { for } s \in [0,L]. \end{aligned}$$
(2)

In this equation, E is Young's modulus and measures the stiffness of the rod, while I is the second moment of area of the cross-section about the axis given by the binormal vector \(\beta \). In our case, we consider a circular cross-section of radius R. So, we have

$$\begin{aligned} I(s) = \frac{\pi }{4}R^4(s). \end{aligned}$$

Combining system (1) with Eq. (2), we obtain

$$\begin{aligned} -(E(s) I(s) \theta '(s))' = \sin \theta (s) \, g \int _s^L \rho (t) \textrm{d} t, \end{aligned}$$
(3)

where we also assume that there is no external load at the rod's tip, that is, \(n(L) = 0\).
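For readers who wish to experiment with Eq. (3), the following is a minimal numerical sketch that solves it as a boundary value problem for a rod of constant radius, using the standard solver scipy.integrate.solve_bvp; all parameter values are hypothetical placeholders, not measured data. Equation (3) is rewritten as a first-order system in the angle \(\theta \), the bending moment \(m = EI\theta '\) and the weight \(w(s) = g\int _s^L \rho \, \textrm{d}t\) of the distal segment.

```python
import numpy as np
from scipy.integrate import solve_bvp

# Hypothetical parameter values, for illustration only.
E = 5e8             # Young's modulus [Pa]
rho3 = 500.0        # volume density [kg/m^3]
g = 9.81            # gravitational acceleration [m/s^2]
L = 0.5             # rod length [m]
R = 2e-3            # constant cross-section radius [m]
theta0 = np.pi / 2  # inclination at the clamped base

I = np.pi / 4 * R**4        # second moment of area of the circular section
rho = rho3 * np.pi * R**2   # linear density

def rhs(s, y):
    # State y = (theta, m, w), with m = E*I*theta' and w(s) = g*int_s^L rho dt.
    theta, m, w = y
    return np.vstack((
        m / (E * I),                 # theta' = m / (E I)
        -np.sin(theta) * w,          # Eq. (3): (E I theta')' = -sin(theta) * w
        -g * rho * np.ones_like(s),  # w' = -g*rho, so that w(L) = 0
    ))

def bc(y0, yL):
    # theta(0) = theta0; no tip load, m(L) = 0; w(L) = 0 by definition.
    return np.array([y0[0] - theta0, yL[1], yL[2]])

s = np.linspace(0.0, L, 200)
y_guess = np.zeros((3, s.size))
y_guess[0] = theta0
sol = solve_bvp(rhs, bc, s, y_guess)
print("converged:", sol.status == 0, "| tip angle:", sol.y[0, -1])
```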

2.1 Formulation of the Problem

As expressed by Eq. (1), when an elastic rod is subject to external forces, the material generates the internal moment m in response (Goodno et al. 2020). This moment causes the deflection of the rod. The internal force per unit area that generates m is called (bending) stress, and we denote it by \(\sigma \). In the case of a circular cross-section, the maximal stress developed by the rod at r(s) is

$$\begin{aligned} \sigma _M(s) = |\theta '(s)|R(s) E(s). \end{aligned}$$

We assume that the maximal stress \(\sigma _M\) cannot cross a certain fixed threshold \({\bar{\sigma }}\). The mass of the shoot is represented by the linear density \(\rho \) of the main stem, which is related to the density per unit of volume \(\rho _3\) and the radius R by the equation

$$\begin{aligned} \rho (s) = \rho _3(s) \pi R^2(s). \end{aligned}$$

To formulate our problem, we take the volume density \(\rho _3\) and the Young's modulus E to be constant along the shoot. Furthermore, we assume that the main stem does not bear any secondary branches or leaves. Then, we represent the optimal control problem of shoot length maximization with the following system:

$$\begin{aligned} {\left\{ \begin{array}{ll} \max _{(R,L) \in {\mathcal {U}}} L \\ \text {subject to} \\ ( - R^4 \theta '(s) )' = \frac{4g}{\pi E} \sin \theta (s) \left[ \int _s^L \rho _3 \pi R^2(\sigma ) \textrm{d}\sigma \right] \\ \theta (0) = \theta _0 \\ \theta '(L) = 0 \\ \int _0^L R^2(\sigma ) \textrm{d}\sigma = M \\ |\theta '(s)| \le \frac{c_2}{R(s)}, \end{array}\right. } \end{aligned}$$
(4)

where

$$\begin{aligned} c_2 = \frac{ {\bar{\sigma }} }{E}. \end{aligned}$$

The boundary conditions mean that we are fixing an initial inclination of the rod equal to \(\theta _0\) and that we are considering just the weight of the rod, without any extra load at the tip, so that the intrinsic and the current curvatures coincide there. The total mass of the main stem is given by

$$\begin{aligned} \int _0^L \rho _3 \pi R^2(\sigma ) \textrm{d}\sigma . \end{aligned}$$

In the above expression, \(\rho _3\) is a constant, hence the constraint \(\int _0^L R^2(\sigma ) \textrm{d}\sigma = M\) means that we are fixing the total mass to the value \(\rho _3 \pi M\). The set of the controls \({\mathcal {U}}\) will be specified later on in the presentation.

We introduce the variables:

$$\begin{aligned} \begin{aligned} \mu (s)&= \int _s^L R^2(\sigma ) \textrm{d}\sigma , \\ \psi&= - R^4\theta ', \\ u&= R^2, \\ c_1&= \frac{4 g \rho _3}{E}. \end{aligned} \end{aligned}$$

To avoid pathological and rather unrealistic situations, we assume upper and lower bounds on the variable u, that is, \(u_m \le u \le u_M\). Using these new variables and conditions, we obtain from (4) the optimization problem:

$$\begin{aligned} (P) \qquad {\left\{ \begin{array}{ll} \max _{(u,L) \in {\mathcal {U}}} L \\ \text {subject to} \\ x'(s) = f_P(x(s),u(s)) &{} \text { a.e. in } [0,L] \\ (x(0),x(L)) \in C_{P,0} \times C_{P,1} \\ h_{P,1}(x(s),u(s)) \le 0 &{} \text { a.e. in } [0,L] \\ h_{P,2}(x(s),u(s)) \le 0 &{} \text { a.e. in } [0,L], \end{array}\right. } \end{aligned}$$

where

  • \(\theta _0 \in [0, 2 \pi ]\) and \(c_1,c_2, u_m, u_M, M \in (0,+\infty )\);

  • \(x = (\psi , \theta , \mu ) \in W^{1,1}([0,L];{\mathbb {R}}^3)\);

  • \({\mathcal {U}} = \{ (u,L) \,: \, L \ge 0 \text { and } u:[0,L]\rightarrow {\mathbb {R}} \;\;\textrm{Lebesgue}\;\;\textrm{measurable} \}\);

  • \(f_P(x,u) = (c_1 \mu \sin \theta , -\psi /u^2, -u)\);

  • \(C_{P,0} = {\mathbb {R}} \times \{ \theta _0 \} \times \{ M \}\);

  • \(C_{P,1} = \{ 0 \} \times {\mathbb {R}} \times \{ 0 \}\);

  • \(h_{P,1}(x,u) = \max \{u_m - u,u - u_M\}\);

  • \(h_{P,2}(x,u) = |\psi | - c_2 u^{3/2}\).

We will see that the upper bound \(u_M\) for u is a fundamental assumption to prove the existence of an optimal solution in the convexified case. In general, the imposition of an upper bound on the set of controls may influence the solution to the optimization problem itself. As we will see, if \(u_M\) is large enough, this is not the case.
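For numerical experimentation with (P), the data above can be encoded directly; the following is a minimal sketch in which the constants are hypothetical placeholders.

```python
import numpy as np

# Hypothetical constants; in the paper they are fixed by the model and the data.
c1, c2, u_m, u_M = 1.0, 1.0, 0.5, 10.0

def f_P(x, u):
    """Dynamics of (P): f_P(x, u) = (c1*mu*sin(theta), -psi/u^2, -u)."""
    psi, theta, mu = x
    return np.array([c1 * mu * np.sin(theta), -psi / u**2, -u])

def h_P1(x, u):
    """Control bounds: non-positive exactly when u_m <= u <= u_M."""
    return max(u_m - u, u - u_M)

def h_P2(x, u):
    """Stress constraint: non-positive exactly when |psi| <= c2 * u^(3/2)."""
    psi, _, _ = x
    return abs(psi) - c2 * u**1.5
```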

2.2 Notation for Multifunctions

In the following, we denote with

$$\begin{aligned} F: X \subset {\mathbb {R}}^n \rightrightarrows {\mathbb {R}}^m \end{aligned}$$

a multifunction from \( X \subset {\mathbb {R}}^n\) to \({\mathbb {R}}^m\), that is, a function whose domain is X and such that \(F(x) \subset {\mathbb {R}}^m\) for every \(x \in X\). The multifunction F is said to be Borel measurable if for any open set \(A \subset {\mathbb {R}}^m\), the set

$$\begin{aligned} F^{-1}(A) = \{ x \in X \,: \, F(x) \cap A \ne \emptyset \} \end{aligned}$$

is a Borel subset of \(X \subset {\mathbb {R}}^n\). Moreover, we say that the multifunction F is closed, convex or nonempty if for any \(x \in X\) the set \(F(x) \subset {\mathbb {R}}^m\) is, respectively, closed, convex or nonempty. We say that F is uniformly bounded if there exists a constant \(\alpha > 0\) such that

$$\begin{aligned} F(x) \subset \alpha {\mathbb {B}}^m \end{aligned}$$

for any \(x \in X\). With \({\mathbb {B}}^m\) we denote the closed unitary ball centred at the origin of \({\mathbb {R}}^m\). The graph of F is the set

$$\begin{aligned} \{ (x,y) \,: \, x \in X \,, \, y \in F(x)\} \subset {\mathbb {R}}^n \times {\mathbb {R}}^m \end{aligned}$$

3 Existence of Optimal Solutions

In this section, we construct a “minimal” modification of problem (P) in order to obtain an optimal control problem for which the existence of an optimal solution is guaranteed. To this aim, we first construct such an enlarged optimal control problem in Sect. 3.1 and then we show that the latter has a feasible solution in Sect. 3.2.

3.1 Relaxation

To start with, we observe that the dynamics of the optimal control problem (P) can be reformulated in terms of a differential inclusion by defining

$$\begin{aligned} {\dot{x}}(s) \in F(x(s)) \end{aligned}$$

with

$$\begin{aligned} F(x) = \left\{ (c_1 \mu \sin (\theta ), - \psi /u^2, -u) \,: \, u \in U(x) \right\} \end{aligned}$$

where

$$\begin{aligned} U(x) = \left\{ u \in [u_m,u_M], u \ge \left( \frac{|\psi |}{c_2} \right) ^{2/3} \right\} . \end{aligned}$$

It is well known that a key assumption for the existence of a solution of an optimal control problem is the convexity of the set of admissible velocities F (see e.g. Vinter et al. (2010)). However, it is easy to see that in our case such a standard existence hypothesis is not verified. A standard procedure to overcome this issue, known as relaxation, is to enlarge the set of admissible trajectories in such a way that the existence of a maximum is guaranteed. To this aim, we consider the convexified version of problem (P):

$$\begin{aligned} (p_d) \qquad {\left\{ \begin{array}{ll} \max L \\ \text {subject to} \\ x'(s) \in F_d(x(s)) &{} \text { a.e. in } [0,L] \\ (x(0),x(L)) \in C_{d,0} \times C_{d,1}, \end{array}\right. } \end{aligned}$$

Here we are using the notation:

  • \(x = (\psi ,\theta ,\mu ) \in W^{1,1}([0,L];{\mathbb {R}}^3)\);

  • \(F_d(x) = {\overline{co}}\left\{ F(x) \right\} \);

  • \(C_{d,0} = {\mathbb {R}} \times \{ \theta _0 \} \times \{ M \} \)

  • \(C_{d,1} = \{ 0 \} \times {\mathbb {R}} \times \{ 0 \}\)

We say that \((x,L) \in W^{1,1}([0,L];{\mathbb {R}}^3) \times [0,+\infty )\) is a trajectory for (\(p_d\)) if it satisfies the differential inclusion, that is, \(x'(s) \in F_{d}(x(s))\) for a.e. \(s \in [0,L]\). A trajectory for (\(p_d\)) is admissible if it also satisfies the constraint \((x(0),x(L)) \in C_{d,0} \times C_{d,1}\). Analogously, we say that \((x,u,L) \in W^{1,1}([0,L];{\mathbb {R}}^3) \times {\mathcal {U}}\) is a trajectory for (P) if \(x' = f_P(x,u)\) a.e. in [0, L], and it is admissible if all the constraints are satisfied. By construction, we observe that if \((x,u,L)\) is an admissible trajectory for (P), then \((x,L)\) is an admissible trajectory for (\(p_d\)).

Analogue terms are used for problem (\(p_c\)) in Sect. 4.1.

Proposition 1

The multifunction \(F_d\) has the following characterisation:

$$\begin{aligned} \begin{aligned} F_d(x)&= \{ c_1 \mu \sin (\theta ) \} \\&\quad \times \left\{ \left( -\psi \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2} \right) , -(\lambda _1 u_1 + \lambda _2 u_2) \right) \,: \, (u_1,u_2,\lambda _1,\lambda _2) \in V(x) \right\} , \end{aligned} \end{aligned}$$

where

$$\begin{aligned} V(x) = \left\{ (u_1,u_2,\lambda _1,\lambda _2) \,: \, u_1,u_2 \in U(x)\quad \text {and}\quad \lambda _1,\lambda _2 \in [0,1], \, \lambda _1 + \lambda _2 = 1 \right\} . \end{aligned}$$

Proof

To prove Proposition 1, first we make the following consideration. Fix a point \(x = (\psi , \theta ,\mu ) \in {\mathbb {R}}^3\) and define

$$\begin{aligned} u_0 = \max \left( u_m, \left( \frac{|\psi |}{c_2}\right) ^{2/3} \right) . \end{aligned}$$

Then

$$\begin{aligned} F_d(x) = \{c_1 \mu \sin (\theta ) \} \times \left\{ (-w,-v) \,: \, (v,w) \in {\overline{co}}\left\{ \left( u, \frac{\psi }{u^2} \right) \,: \, u \in [u_0,u_M]\right\} \right\} . \end{aligned}$$

Hence, we can focus on the set

$$\begin{aligned} A:={\overline{co}}\left\{ \left( u, \frac{\psi }{u^2} \right) \,: \, u \in [u_0,u_M]\right\} . \end{aligned}$$

We want to prove that

$$\begin{aligned} A = B:= \left\{ \left( \lambda _1 u_1 + \lambda _2 u_2, \psi \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2}\right) \right) \,: \, (u_1,u_2,\lambda _1,\lambda _2) \in V(x) \right\} . \end{aligned}$$

Let us denote by g the function

$$\begin{aligned} g: [u_0,u_M] \rightarrow {\mathbb {R}}, \qquad g(u) = \frac{\psi }{u^2}, \end{aligned}$$

and by \(\ell : [u_0,u_M] \rightarrow {\mathbb {R}}\) the straight line that joins the point \((u_0,g(u_0))\) to the point \((u_M,g(u_M))\), that is

$$\begin{aligned} \ell (u) = g(u_0) + \left( \frac{g(u_M) - g(u_0)}{u_M - u_0} \right) (u - u_0). \end{aligned}$$

Define

$$\begin{aligned} C:= epi(g) \cap hyp(\ell ), \end{aligned}$$

where

$$\begin{aligned} epi(g)=\{(u,\alpha ): \; u\in [u_0,u_M], \;\;\; g(u)\le \alpha \} \end{aligned}$$

and

$$\begin{aligned} hyp(\ell )=\{(u,\beta ): \; u\in [u_0,u_M], \;\;\; \ell (u)\ge \beta \}. \end{aligned}$$

Since g is a convex and continuous function and \(hyp(\ell )\) is a convex set, C is a closed convex set containing \(\left\{ \left( u, \frac{\psi }{u^2} \right) \,: \, u \in [u_0,u_M]\right\} \). In particular, one has that \(A \subseteq C\).

Step 1: \(C \subseteq B\).

Consider \(({\bar{u}},{\bar{v}}) \in C\). Then one has that

$$\begin{aligned} g({\bar{u}}) \le {\bar{v}} \le \ell ({\bar{u}}). \end{aligned}$$

Take now the straight line \(\ell _0:[u_0,u_M] \rightarrow {\mathbb {R}}\) with the same slope of \(\ell \) and such that \(\ell _0({\bar{u}}) = {\bar{v}}\), that is

$$\begin{aligned} \ell _0(u) = {\bar{v}} + \left( \frac{g(u_M) - g(u_0)}{u_M - u_0} \right) (u - {\bar{u}}). \end{aligned}$$

By construction, one has that

$$\begin{aligned} \begin{aligned} \ell _0(u_0)&\le \ell (u_0) = g(u_0), \\ \ell _0(u_M)&\le \ell (u_M) = g(u_M). \end{aligned} \end{aligned}$$

Hence, by the continuity of \(g - \ell _0\) and the intermediate value theorem, there exist \(u_1\) and \(u_2\) satisfying \(u_1 \le {\bar{u}} \le u_2\) and such that

$$\begin{aligned} \begin{aligned} g(u_1)&= \ell _0(u_1) \\ g(u_2)&= \ell _0(u_2) \end{aligned} \end{aligned}$$

which means that the point \(({\bar{u}},{\bar{v}})\), lying on the segment of \(\ell _0\) between them, is a convex combination of the points

$$\begin{aligned} (u_1,g(u_1)),(u_2,g(u_2)). \end{aligned}$$

This shows that \(C\subseteq B\).

Step 2: \(A = B\).

Since it follows from Step 1 that \(C \subseteq B\), one also has \(A \subseteq B\). On the other hand, it follows from the definition of B that also the inclusion \(B \subseteq A\) holds. Hence one has that \(A = B\). This completes the proof. \(\square \)
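Proposition 1 can also be checked numerically. The sketch below (with hypothetical values of \(\psi \), \(u_0\) and \(u_M\)) samples the set B of two-point convex combinations and verifies that it lies in the convex hull of the discretized curve \(u \mapsto (u, \psi /u^2)\).

```python
import numpy as np
from scipy.spatial import ConvexHull, Delaunay

# Hypothetical values: psi < 0 as in Proposition 4, u0 and uM as in the proof.
psi, u0, uM = -1.0, 1.0, 3.0
u = np.linspace(u0, uM, 400)
curve = np.column_stack((u, psi / u**2))

rng = np.random.default_rng(0)
i, j = rng.integers(0, u.size, size=(2, 2000))  # random pairs of curve points
lam = rng.uniform(0.0, 1.0, 2000)
B = lam[:, None] * curve[i] + (1 - lam)[:, None] * curve[j]

hull = ConvexHull(curve)
tri = Delaunay(curve[hull.vertices])  # point-in-hull test via triangulation
print("fraction of B inside co(curve):", (tri.find_simplex(B) >= 0).mean())
```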

3.2 Existence of Relaxed Optimal Solutions

In this section, we will show the existence of an optimal solution for (\(p_d\)). To achieve this result, we begin by proving the existence of at least one admissible trajectory.

Proposition 2

Fix the constants

$$\begin{aligned} c_1,c_2,u_m,M \in (0,+\infty ) \text { and } \theta _0 \in [0, 2 \pi ] \end{aligned}$$

and choose

$$\begin{aligned} ({HP_{max}}) \qquad u_M \ge \frac{M}{{\bar{L}}}, \qquad {\bar{L}} := \min \left( \frac{M}{u_m}, \frac{1}{c_1 M}, M^{2/3}, \left( \frac{c_2}{c_1} \sqrt{M} \right) ^{2/5} \right) . \end{aligned}$$

Then there exists an admissible trajectory \((x,u,L)\) for (P).

Consequently, also problem (\(p_d\)) has at least one admissible trajectory.

Proof

To prove the statement, we make use of a fixed point argument. Let us fix the constant control \(u = M/L\); for \(L \le M/u_m\) we have \(u \ge u_m\), which implies that the lower bound given by \(h_{P,1}\) is satisfied. Now, let us consider \(\mu (t) = (L - t)M/L\). It is easy to observe that the trajectory \(((\psi , \theta + \theta _0, \mu ), M/L, L)\) is admissible for problem (P) if and only if \((\psi ,\theta , \mu )\) solves the system

$$\begin{aligned} {\left\{ \begin{array}{ll} \psi ' = c_1 \frac{M (L - t)}{L} \sin (\theta + \theta _0) \\ \theta ' = - \psi \frac{L^2}{M^2} \\ \psi (L) = 0 \\ \theta (0) = 0 \\ |\psi | \le c_2 u^{3/2} \end{array}\right. } \end{aligned}$$
(5)

for all \(t\in [0,L]\). We define \(X = \{ (\psi ,\theta ) \in C([0,L];{\mathbb {R}}^2) \,: \, \psi (L) = \theta (0) = 0 \}\). Then \((X,|| \cdot ||_{\infty })\) is a Banach space. Consider the function \(F: X \rightarrow X\)

$$\begin{aligned} \begin{aligned} F(\psi ,\theta )(s)&= \left( -\int _s^L c_1 \frac{M (L - t)}{L} \sin (\theta (t) + \theta _0) \textrm{d}t, - \int _0^s \psi (t) \frac{L^2}{M^2} \textrm{d}t \right) \\&= (F_1(\psi , \theta ), F_2(\psi , \theta )) \end{aligned} \end{aligned}$$

To prove the existence of a solution to system (5) we just need to prove that for L small enough, F is a contraction and the inequality for \(|\psi |\) is satisfied.

Let \((\psi ,\theta ), ({\bar{\psi }},{\bar{\theta }}) \in X\). Then

$$\begin{aligned} \begin{aligned} ||F_1(\psi ,\theta ) - F_1({\bar{\psi }},{\bar{\theta }}) ||_{\infty }&\le c_1 L M ||\theta - {\bar{\theta }}||_{\infty } \\ ||F_2(\psi ,\theta ) - F_2({\bar{\psi }},{\bar{\theta }}) ||_{\infty }&\le \frac{L^3}{M^2} ||\psi - {\bar{\psi }}||_{\infty } \end{aligned} \end{aligned}$$

This means that for \(L < \min \left\{ \frac{1}{c_1 M}, M^{2/3} \right\} \), F is a contraction. Moreover, we notice that \(||\psi '||_{\infty } \le c_1 M\). This bound on the derivative, combined with the terminal condition \(\psi (L) = 0\), implies \(||\psi ||_{\infty } \le c_1 L M\). Then, the inequality on \(\psi \) is always satisfied if

$$\begin{aligned} c_1 L M \le c_2 \left( M / L \right) ^{3/2} \end{aligned}$$

which gives the upper bound \(L \le \left( \frac{c_2}{c_1} \sqrt{M} \right) ^{2/5}\).

Collecting all the upper bounds for L, we can define

$$\begin{aligned} \bar{L} = \min \left( \frac{M}{u_m}, \frac{1}{c_1 M}, M^{2/3}, \left( \frac{c_2}{c_1} \sqrt{M} \right) ^{2/5} \right) \end{aligned}$$

and we can take \(u = M/{\bar{L}}\). So, if condition (\({HP_{max}}\)) is satisfied, then \(u \le u_M\). That is, the trajectory \(((\psi ,\theta + \theta _0,\mu ), u, {\bar{L}})\) is admissible for (P) and consequently \(((\psi ,\theta + \theta _0,\mu ), {\bar{L}})\) is an admissible solution for (\(p_d\)). \(\square \)
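The contraction argument above translates directly into a computation. The following is a minimal sketch (with hypothetical constants) that runs the Picard iteration of the map F to approximate the admissible trajectory of system (5); the length is taken strictly below the bound \({\bar{L}}\) so that F is a genuine contraction.

```python
import numpy as np

# Hypothetical constants, for illustration only.
c1, c2, M, u_m = 1.0, 1.0, 1.0, 0.5
theta0 = np.pi / 2

# Strictly below the bound collecting the constraints of the proof.
Lbar = 0.9 * min(M / u_m, 1 / (c1 * M), M**(2 / 3), (c2 / c1 * np.sqrt(M))**(2 / 5))

N = 2000
s = np.linspace(0.0, Lbar, N)
ds = s[1] - s[0]
psi = np.zeros(N)
theta = np.zeros(N)

for _ in range(60):  # Picard iteration of the contraction F
    integrand = c1 * M * (Lbar - s) / Lbar * np.sin(theta + theta0)
    # F1: psi(s) = -int_s^L integrand dt (cumulative sum from the right).
    psi_new = -np.cumsum(integrand[::-1])[::-1] * ds
    # F2: theta(s) = -(L^2 / M^2) int_0^s psi dt.
    theta_new = -np.cumsum(psi) * ds * Lbar**2 / M**2
    psi, theta = psi_new, theta_new

u = M / Lbar  # constant control of the proof
print("stress constraint |psi| <= c2*u^(3/2):", np.abs(psi).max() <= c2 * u**1.5)
```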

To prove the existence of a maximizing trajectory, we need a bound on the initial condition of the optimal trajectory. This property follows from the limited mass M at our disposal and the boundedness of the dynamics.

Proposition 3

The set of admissible trajectories for (\(p_d\)) does not change if we replace \(C_{d,0}\) defined in (\(p_d\)) with

$$\begin{aligned} {{\tilde{C}}}_{d,0} = \left[ -\frac{c_1M^2}{u_m}, \frac{c_1M^2}{u_m}\right] \times \{ \theta _0 \} \times \{ M \} \end{aligned}$$

Moreover, for any admissible trajectory \((x,L) \in W^{1,1}([0,L];{\mathbb {R}}^3) \times [0, +\infty )\) we have

$$\begin{aligned} \begin{array}{ll} L \in \left( 0, \frac{M}{u_m} \right] , &{} |\psi (s)| \le \frac{c_1M^2}{u_m}, \\ |\theta (s) - \theta _0| \le \frac{c_1 M^3}{u_m^4}, &{} 0 \le \mu (s) \le M, \end{array} \end{aligned}$$

for every \(s \in [0,L]\).

Proof

Let \(x = (\psi ,\theta ,\mu )\) be an admissible trajectory for (\(p_d\)). Since \(\mu (0) = M > 0 = \mu (L)\), we must have \(L > 0\). Moreover, one has that

$$\begin{aligned} M = - \int _0^L \mu '(\sigma ) \textrm{d}\sigma \ge u_m L, \end{aligned}$$

which implies the bound

$$\begin{aligned} L \le \frac{M}{u_m}. \end{aligned}$$

The bound on \(\mu \) follows immediately from the dynamics. Indeed, by definition of \(F_d\), one has \(\mu ' \le 0\). Hence

$$\begin{aligned} 0 \le - \int _s^L \mu '(\sigma ) \textrm{d}\sigma = \mu (s) = M + \int _0^s \mu '(\sigma ) \textrm{d}\sigma \le M \end{aligned}$$

For what concerns the variable \(\psi \), we have

$$\begin{aligned} \begin{aligned} |\psi (s)| = \left| \int _s^L c_1 \mu (\sigma ) \sin (\theta (\sigma )) \textrm{d}\sigma \right| \le c_1 M L \le c_1 \frac{M^2}{u_m}, \end{aligned} \end{aligned}$$

which gives the bound for \(\psi \) in [0, L].

Finally, by taking into account the equation for \(\theta '\), we observe that

$$\begin{aligned} |\theta (s) - \theta _0| \le \int _{0}^s |\theta '(\sigma )| \textrm{d}\sigma \le \frac{c_1 M^2}{u_m}\frac{1}{u_m^2} \frac{M}{u_m}=c_1\frac{M^3}{u_m^4}. \end{aligned}$$

Hence, all the bounds of the thesis are verified. \(\square \)

To prove the existence of a maximizer we employ a compactness theorem stated in Proposition 2.5.3 in Vinter et al. (2010), and the bound on the admissible lengths L stated in Proposition 3.

Theorem 1

Assume (\({HP_{max}}\)) holds. Then, problem (\(p_d\)) admits a maximizer.

Proof

Define the closed and bounded set \(X \subset {\mathbb {R}}^3\),

$$\begin{aligned} X = \left[ -\frac{c_1M^2}{u_m}, \frac{c_1M^2}{u_m}\right] \times \left[ - |\theta _0| - \frac{c_1 M^3}{u_m^4}, |\theta _0| + \frac{c_1 M^3}{u_m^4}\right] \times \left[ 0, M \right] \end{aligned}$$

In view of Proposition 3, for any admissible trajectory \((x,L) \in W^{1,1}([0,L];{\mathbb {R}}^3) \times [0,+\infty )\), one has that \((\psi , \theta , \mu )(t)\in X\) for all \(t\in [0,L]\).

Step 1: \(F_d: {\mathbb {R}}^3 \rightrightarrows {\mathbb {R}}^3\) is a closed, convex, nonempty, Borel measurable multifunction.

\(F_d\) is closed, convex and nonempty by definition. Concerning the Borel measurability, define

$$\begin{aligned} f_{u_1,u_2,\lambda }(x): = \left( c_1 \mu \sin (\theta ), -\psi \left( \frac{\lambda }{u_1^2} + \frac{1 - \lambda }{u_2^2} \right) , -(\lambda u_1 + (1 - \lambda ) u_2) \right) . \end{aligned}$$

Take any open set \(A \subset {\mathbb {R}}^3\). By taking into account Proposition 1 and the continuity of \(f_{u_1,u_2,\lambda }(x)\) with respect to \(u_1,u_2,\lambda \) and x, we have

$$\begin{aligned} \begin{aligned} F_d^{-1}(A)&= \{ x \in {\mathbb {R}}^3 \,: \, F_{d}(x) \cap A \ne \emptyset \} \\&= \bigcup _{u_1 \le u_2 \in {\mathbb {Q}} \cap [u_m,u_M]} \bigcup _{\lambda \in {\mathbb {Q}} \cap [0,1]} \left[ f_{u_1,u_2,\lambda }^{-1}(A) \cap \left( [-c_2 u_1^{3/2}, c_2 u_1^{3/2}] \times {\mathbb {R}}^2 \right) \right] \end{aligned} \end{aligned}$$

Hence, \(F_d^{-1}(A)\) is a countable union of Borel sets, that is, a Borel set itself.

Step 2: \(F_d\) has a closed graph. Let \((x_n)_n\) and \((v_n)_n\) be two sequences in \({\mathbb {R}}^3\) such that for each n,

$$\begin{aligned} v_n \in F_d(x_n) \end{aligned}$$

and \(x_n \rightarrow x\), \(v_n \rightarrow v\) for some \(x,v \in {\mathbb {R}}^3\). We want to prove that \(v \in F_{d}(x)\).

It follows from Proposition 1 that, for each n, there exist \(u_{1,n}, u_{2,n}, \lambda _n\) such that

$$\begin{aligned} f_{u_{1,n},u_{2,n},\lambda _n} (x_n) = v_n. \end{aligned}$$

Since \(u_{1,n},u_{2,n} \in [u_m,u_M]\) and \(\lambda _n \in [0,1]\) for each n, then there exist \(u_1,u_2 \in [u_m,u_M]\) and \(\lambda \in [0,1]\) such that \(u_{1,n} \rightarrow u_1\), \(u_{2,n} \rightarrow u_2\) and \(\lambda _n \rightarrow \lambda \) at least along a subsequence. Thus, by continuity, we have

$$\begin{aligned} \begin{aligned} f_{u_1,u_2,\lambda }(x)&= v, \\ u_1, u_2&\ge \left( \frac{|\psi |}{c_2}\right) ^{2/3}, \end{aligned} \end{aligned}$$

that is exactly \(v \in F_{d}(x)\).

Step 3: there exists \(\alpha > 0\) such that \(F_{d}(x) \subset \alpha {\mathbb {B}}^3\) for any \(x \in X\). It follows from the definition of X and the boundedness of the control set that, for any \(x \in X\), we have

$$\begin{aligned} F_d(x) \subset \sqrt{3} \max \left\{ c_1 M, \frac{c_1 M^2}{u_m^3}, u_M \right\} {\mathbb {B}}^3 \end{aligned}$$

where we recall that \({\mathbb {B}}^3 \subset {\mathbb {R}}^3\) is the closed unitary ball centred at the origin.

Step 4: Passing to the limit from a maximizing sequence.

Consider a maximizing sequence of admissible trajectories \((x_n,L_n)_n\) for problem (\(p_d\)). In view of Proposition 2, such a sequence exists. It follows from Proposition 3 that the end-time sequence \((L_n)_n\) is bounded. So, along a subsequence (we do not relabel), there exists \(L > 0\) such that \(L_n \rightarrow L\). Furthermore, it is not restrictive to assume that \(L_n \le L\) for all n sufficiently large along a subsequence (the case \(L_n \ge L\) can be treated by using similar arguments). Hence, we can extend \(x_n\) to the whole [0, L] by defining

$$\begin{aligned} y_n(s) = {\left\{ \begin{array}{ll} x_n(s) &{} s \in [0,L_n] \\ x_n(L_n) &{} s \in [L_n,L] \end{array}\right. }, \end{aligned}$$

so that \(y_n \in W^{1,1}([0,L];{\mathbb {R}}^3)\) with \(y'_n = x'_n\) in \([0,L_n]\) and \(y'_n \equiv 0\) in \([L_n,L]\). It follows from Proposition 3 that \(y_n(s) \in X\) for every n and every \(s \in [0,L]\). Since \(F_d\) restricted to X is bounded, \(y'_n \in F_d(y_n)\) a.e. in \([0,L_n]\) and \(y'_n \equiv 0\) in \([L_n,L]\) for every n, the sequence \((y'_n)_n\) is uniformly essentially bounded. Moreover, by invoking again Proposition 3, one has that \((y_n(0))_n\) is a bounded sequence.

Let us define \(A_n = [0,L_n]\) and observe that \({\mathcal {L}}(A_n) = L_n \rightarrow L\) as \(n \rightarrow +\infty \). It follows from Proposition 2.5.3 of Vinter et al. (2010) that there exists a function \(x \in W^{1,1}([0,L];{\mathbb {R}}^3)\) such that \(x' \in F_d(x)\) a.e. in [0, L] and \((x(0),x(L)) \in C_{d,0} \times C_{d,1}\), that is, (x, L) is an admissible trajectory for (\(p_d\)). Since L is the limit of a maximizing sequence of lengths, (x, L) is a maximizer for (\(p_d\)). This concludes the proof. \(\square \)

4 Necessary Conditions

4.1 Reformulation

Proposition 1 characterizes the velocity set \(F_d\). Using such a characterization, we recast Problem (\(p_d\)) into the following system:

$$\begin{aligned} (p_c) \qquad {\left\{ \begin{array}{ll} \max _{(u,L) \in {\mathcal {V}}} L \\ \text {subject to} \\ x'(s) = f_c(x(s),u(s)) &{} \text { a.e. in } [0,L] \\ (x(0),x(L)) \in C_{d,0} \times C_{d,1} \\ h_c(x(s),u(s)) \le 0 &{} \text { a.e. in } [0,L], \end{array}\right. } \end{aligned}$$

with

  • \(x = (\psi , \theta , \mu ) \in W^{1,1}([0,L];{\mathbb {R}}^3)\);

  • \(u = (u_1,u_2,\lambda )\);

  • \({\mathcal {V}} = \{(u, L) \,: \, u:[0,L]\rightarrow {\mathbb {R}}\times {\mathbb {R}}\times [0,1] \;\;\textrm{Lebesgue}\;\;\textrm{measurable}, \; L \in [0,+\infty )\}\);

  • \(f_c(x,u) = \left( c_1 \mu \sin (\theta ),- \psi \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2} \right) ,- (\lambda _1 u_1 + \lambda _2 u_2) \right) \) with \(\lambda _1 = \lambda \) and \(\lambda _2 = 1 - \lambda _1\);

  • \(h_c(x,u) = \max \left\{ |\psi | - c_2 u_i^{3/2}, u_m - u_i, u_i - u_M, - \lambda , \lambda - 1 \right\} ,\)

    where the use of the subscript i means that the maximum is taken over both \(i = 1,2\).

In view of Proposition 9 (see Appendix), problem (\(p_c\)) is equivalent to problem (\(p_d\)).

Proposition 4

Let \((x,u,L)\) be an admissible trajectory for problem (\(p_c\)) such that \(\theta (0) = \pi /2\). Then, if one of the following two conditions holds true

$$\begin{aligned} ({HP_{1c}}) \quad c_1< \frac{\pi }{2} \frac{u_m^4}{M^3}, \qquad \qquad ({HP_{2c}}) \quad c_2 < \frac{\pi \sqrt{u_m}}{2L}, \end{aligned}$$

we have:

  • \(\theta \) is increasing and \(\theta (s) \in [\pi /2,\pi )\) for all \(s \in [0,L]\);

  • \(\psi \) is increasing and \(\psi (s) < 0 \) for all s in [0, L).

Proof

Assume \(u_1 \le u_2\). From the dynamics \(f_c\) we observe that

$$\begin{aligned} \theta ' = - \psi \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2}\right) . \end{aligned}$$
(6)

Furthermore, from the state constraint, we have that

$$\begin{aligned} |\psi | \le c_2 u_1^{3/2}. \end{aligned}$$
(7)

Consequently it follows from (6), (7) and from Proposition 3 that

$$\begin{aligned} |\theta '| \le c_2 u_1^{3/2} \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2}\right) \le \frac{c_2}{\sqrt{u_1}}. \end{aligned}$$

So, if \(c_2 < \frac{\pi \sqrt{u_m}}{2L}\), for every \(s \in [0,L]\)

$$\begin{aligned} \left| \theta (s) - \frac{\pi }{2}\right| \le \frac{c_2 L}{\sqrt{u_1}} \le c_2 \frac{L}{\sqrt{u_m}} < \frac{\pi }{2} \end{aligned}$$

and we get that \(\theta \in (0,\pi )\). On the other hand, if \(c_1 < \frac{\pi }{2} \frac{u_m^4}{M^3}\), Proposition 3 yields the same conclusion.

The bound \(\theta \in (0,\pi )\) determines the signs of \(\psi '\) and \(\psi \):

$$\begin{aligned} \begin{aligned} \psi '(s)&= c_1 \mu (s) \sin (\theta (s)) \ge 0,\\ \psi (s)&= - \int _s^L c_1 \mu (\sigma ) \sin (\theta (\sigma )) \textrm{d} \sigma \le 0. \end{aligned} \end{aligned}$$

Using the above relations, we refine the bound on \(\theta \) and get the monotonicity property:

$$\begin{aligned} \begin{aligned}&\theta '(s) = - \psi (s) \left( \frac{\lambda _1(s)}{u_1^2(s)} + \frac{\lambda _2(s)}{u_2^2(s)} \right) \ge 0,\\&\theta (s) - \frac{\pi }{2} = \int _0^s (- \psi )(\sigma ) \left( \frac{\lambda _1(\sigma )}{u_1^2(\sigma )} + \frac{\lambda _2(\sigma )}{u_2^2(\sigma )} \right) \textrm{d}\sigma \ge 0. \end{aligned} \end{aligned}$$

which means that \(\theta \) is increasing and \(\theta \in [\pi /2, \pi )\), concluding the proof. \(\square \)

Remark 1

In Proposition 4 the condition on the constant \(c_2\) depends on the length L of the admissible trajectory. However, using the upper bound on L given in Proposition 3, one obtains the L-independent sufficient condition

$$\begin{aligned} c_2 < \frac{\pi u_m^{3/2}}{2M} \end{aligned}$$

so that if \(c_2\) satisfies this condition, then (\(HP_2c\)) holds for any admissible trajectory.

4.2 Notations for Basic Non-smooth Analysis

The mixed constraint \(h_c\) leads to a formulation of Pontryagin's maximum principle that involves only absolutely continuous adjoint trajectories. This version of the maximum principle can be found in Theorem 2.1 of Clarke et al. (2010). Before discussing the necessary conditions for the optimization problem (\(p_c\)), we fix some essential notation of non-smooth analysis.

Given a non-empty closed set \(S \subset {\mathbb {R}}^n\), we define the proximal normal cone to S at a point \(x \in S\) as

$$\begin{aligned} N_S^P(x) = \{ \xi \in {\mathbb {R}}^n \,: \, \exists M > 0 \text { such that } \forall x' \in S \,, \, \langle \xi , x' - x \rangle \le M |x' - x|^2 \}. \end{aligned}$$

Any element of the proximal normal cone is called proximal normal vector. We define the limiting normal cone (also known as Mordukhovich’s normal cone) as

$$\begin{aligned} N_S^L(x) = \{ \xi \,: \, \xi = \lim _{i} \xi _i \text { for some sequences } x_i \in S \,, \, x_i \rightarrow x \,, \, \xi _i \in N_S^P(x_i) \} \end{aligned}$$

and we also define the generalized normal cone (also known as Clarke’s normal cone) as

$$\begin{aligned} N_S^C(x) = {\overline{co}}N_S^L(x). \end{aligned}$$

It is clear from the definitions that

$$\begin{aligned} N_S^P(x) \subset N_S^L(x) \subset N_S^C(x). \end{aligned}$$

In an analogous way, we define proximal, limiting and generalized subgradients for a lower semicontinuous function \(f: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\):

$$\begin{aligned} \begin{aligned} \partial ^P f(x)&= \{ \xi \,: \, (\xi , -1) \in N_{epi(f)}^P(x,f(x)) \}, \\ \partial ^L f(x)&= \{ \xi \,: \, \xi = \lim _{i} \xi _i \text { for some sequences } \xi _i \in \partial ^Pf(x_i) \text { with } x_i \rightarrow x \,, \, f(x_i) \rightarrow f(x) \}, \\ \partial ^Cf(x)&= co\{\partial ^L f(x)\} \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \partial ^P f(x) \subset \partial ^L f(x) \subset \partial ^C f(x). \end{aligned}$$
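As a concrete illustration of these objects (a standard example, independent of the problem at hand), consider \(f(x) = -|x|\) on \({\mathbb {R}}\) at \(x = 0\): no vector of the form \((\xi , -1)\) is a proximal normal to the epigraph at its downward corner, so \(\partial ^P f(0) = \emptyset \); taking limits from nearby points, where f is smooth with slopes \(\pm 1\), gives \(\partial ^L f(0) = \{-1, 1\}\); finally, \(\partial ^C f(0) = [-1,1]\). Hence both inclusions can be strict.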

There is a well-known relation between the level sets of a function and its subgradients, stated in the following theorem (see Theorem 11.38 in Clarke et al. 2013).

Theorem 2

Let f be a locally Lipschitz function and define

$$\begin{aligned} S = \{ x \,: \, f(x) \le 0\}. \end{aligned}$$

Fix \(x \in S \) such that \(f(x) = 0\). If \(0 \notin \partial ^L f(x)\), then

$$\begin{aligned} N_S^L(x) \subset \{ \lambda \xi \,: \, \lambda \ge 0 \,, \, \xi \in \partial ^L f(x) \}. \end{aligned}$$

Another useful result is the so-called max rule (see for instance Theorem 5.5.2 in Vinter et al. 2010).

Theorem 3

Let \(f_i: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\), \(i = 1,\ldots ,m\), be a collection of m locally Lipschitz continuous functions. Define \(f: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) as

$$\begin{aligned} f(x) = \max _i f_i(x) \end{aligned}$$

and

$$\begin{aligned} \Lambda = \{ (\lambda _1,...,\lambda _m) \in {\mathbb {R}}^m \,: \, \lambda _i \ge 0 \text { for every } i = 1,...,m \text { and } \lambda _1 +... + \lambda _m = 1\}. \end{aligned}$$

Then

$$\begin{aligned} \partial ^L f(x) \subset \left\{ \partial ^L \left[ \sum _{i = 1}^m \lambda _i f_i \right] (x) \,: \, (\lambda _1,..., \lambda _m) \in \Lambda \text {, and } f_i(x) < f(x) \implies \lambda _i = 0 \right\} . \end{aligned}$$

4.3 A Pontryagin’s Maximum Principle

4.3.1 The Bounded Slope Condition

To derive the necessary conditions for (\(p_d\)), it is important to verify the Lipschitz continuity and the boundedness of \(F_d\) (see for instance Section 2.3 in Clarke et al. (2005)). These properties are implied by the bounded slope condition, which constrains the proximal normal vectors to the constraint set. A single-valued function is easier to manage than a multifunction, so we verify this condition for problem (\(p_c\)). A single-valued formulation of the bounded slope condition can be found in condition \(\varvec{BS_*^{\varepsilon ,R}}\) of Clarke et al. (2010), which we summarize here.

Fig. 1

Constraint set given by the conditions \(u \in [u_m,u_M]\) and \(|\psi | \le c_2 |u|^{3/2}\). The black arrows represent the normal vectors to the edges of the set; at the corners A, B, C, D, where the boundary is not smooth, those normal vectors become normal cones. The value \(u_{\max }\) is the one given by condition (\({HP_{\max }}\)), while \(\psi _{\max }\) is the corresponding maximum value of \(\psi \), that is \(\psi _{\max } = (c_1 M^2)/u_m\). The bounded slope condition can be translated into the following geometrical statement: there must be no vertically oriented normal vectors near the optimal trajectory. As shown by the orange arrows in the figure, corners B and C are clearly a threat to this condition. However, since any admissible trajectory satisfies \(|\psi | \le \psi _{\max }\), the optimal trajectory (and, more generally, any admissible trajectory) never reaches the green area of the figure

Assume we are considering an optimisation problem whose admissible trajectories \((x,u)\) (where u is the control) satisfy the constraint \((t,x(t),u(t)) \in S\), where S is a closed set. Let \((x^*,u^*, L^*)\) be an optimal process and \(R(t) > 0\) an arbitrary positive measurable function. Define a tube around the optimal process \((x^*,u^*, L^*)\) in S

$$\begin{aligned} S_*^{\varepsilon ,R}(t) = \{ (t,x,u) \in S \,: \, |x - x^*(t)| \le \varepsilon \,, \, |u - u^*(t)| \le R(t) \} \end{aligned}$$

for a.e. \(t\in [0,L^*]\). The bounded slope condition requires the existence of a measurable real-valued function \(k_S\) such that

$$\begin{aligned} (x,u) \in S_*^{\varepsilon ,R}(t), \, (\alpha ,\beta ) \in N_{S(t)}^P (x,u) \implies |\alpha | \le k_S(t) |\beta |. \end{aligned}$$

In our case, \(S = [0,L^*] \times \{(x,u) \,: \, h_c(x,u) \le 0 \}\) and in the following proposition, we verify the bounded slope condition for a fixed R. An intuitive idea of why in our case this condition holds is given in Fig. 1.

Proposition 5

Let

$$\begin{aligned} E = \{(x,u) \,: \, h_c(x,u) \le 0 \} \end{aligned}$$

If

$$\begin{aligned} ({HP_{max}^1}) \qquad u_M > \left( \frac{c_1 M^2}{c_2 u_m} \right) ^{2/3}, \end{aligned}$$

then there exist \(\varepsilon _c, K_c \in (0,+\infty )\) such that for any admissible trajectory \((x,u,L)\) for problem (\(p_c\)) and for any \(s \in [0,L]\)

$$\begin{aligned} |x - x(s)| \le \varepsilon _c, \, |u - u(s)| \le \varepsilon _c, \, (\alpha , \beta ) \in N^P_E(x,u) \implies |\alpha | \le K_c |\beta | \end{aligned}$$

Notice that we prove a condition stronger than the bounded slope condition, since we show such a property in a neighbourhood of any admissible process \((x,u,L)\) and not just of the optimal process \((x^*,u^*,L^*)\).

Proof

Consider the functions

$$\begin{aligned} {\left\{ \begin{array}{ll} a_i(x,u) = |\psi | - c_2 u_i^{3/2} &{} i =1,2\\ b_i(x,u) = u_m - u_i &{} i = 1,2 \\ c_i(x,u) = u_i - u_M &{} i = 1,2\\ d(x,u) = - \lambda \\ e(x,u) = \lambda -1 \end{array}\right. } \end{aligned}$$

Each of the above functions is smooth on E and, by definition,

$$\begin{aligned} h_c = \max (a_1,a_2,b_1,b_2,c_1,c_2,d,e), \end{aligned}$$

where, with a slight abuse of notation, \(c_1\) and \(c_2\) here denote the functions defined above rather than the constants of the model.

In the notation of Theorem 3, set

$$\begin{aligned} \begin{aligned} G(x,u) = \{&(\lambda _{a,1} \nabla a_1 +\cdots + \lambda _e \nabla e)(x,u)\,: (\lambda _{a,1},\ldots ,\lambda _e) \in \Lambda ,\\&\text { with } \lambda _{a,1} = 0 \text { whenever } a_1(x,u) < 0, \text { and similarly for the other multipliers} \} \end{aligned} \end{aligned}$$
(8)

and by its application, we have

$$\begin{aligned} \partial ^L h_c(x,u) \subset G(x,u) \end{aligned}$$

We can compute explicitly the gradients of the functions:

$$\begin{aligned} \begin{aligned} \nabla a_1&= [\, sgn(\psi ), \; 0, \; 0, \; -\tfrac{3}{2} c_2 \sqrt{u_1}, \; 0, \; 0 \,] \\ \nabla a_2&= [\, sgn(\psi ), \; 0, \; 0, \; 0, \; -\tfrac{3}{2} c_2 \sqrt{u_2}, \; 0 \,] \\ \nabla b_1&= [\, 0, \; 0, \; 0, \; -1, \; 0, \; 0 \,] \\ \nabla b_2&= [\, 0, \; 0, \; 0, \; 0, \; -1, \; 0 \,] \\ \nabla c_1&= [\, 0, \; 0, \; 0, \; 1, \; 0, \; 0 \,] \\ \nabla c_2&= [\, 0, \; 0, \; 0, \; 0, \; 1, \; 0 \,] \\ \nabla d&= [\, 0, \; 0, \; 0, \; 0, \; 0, \; -1 \,] \\ \nabla e&= [\, 0, \; 0, \; 0, \; 0, \; 0, \; 1 \,] \end{aligned} \end{aligned}$$

We want to show that for any \(g = (g_1,\ldots , g_6) \in G(x,u)\), we have \((g_4,g_5,g_6) \ne (0,0,0)\). The case \((g_4,g_5,g_6) = (0,0,0)\) can happen only in the following situations:

  1. \(\tilde{\lambda } \nabla b_i + \tilde{\lambda } \nabla c_i = 0 \) for some \(\tilde{\lambda } > 0\);

  2. \(\tilde{\lambda } \nabla d + \tilde{\lambda } \nabla e = 0\) for some \(\tilde{\lambda } > 0\);

  3. \(\lambda _{a,i} \nabla a_i + \lambda _{c,i} \nabla c_i = 0\) for some \(\lambda _{a,i}, \lambda _{c,i} > 0\).

Cases 1 and 2 cannot happen. Indeed, assume for instance that case 1 holds true. Since \(\tilde{\lambda } > 0\), this is possible only if \(b_i(x,u) = c_i(x,u) = 0\), consequently \(u_i = u_m = u_M\) and this is not possible. Similar reasoning can be applied to case 2.

Now, we want to show that for \((x,u)\) sufficiently close to (x(s), u(s)) for any \(s \in [0,L]\), case 3 never occurs. Indeed, to have \(\lambda _{a,i}, \lambda _{c,i} > 0\), we need

$$\begin{aligned} {\left\{ \begin{array}{ll} u_i = u_M \\ \left( \frac{|\psi |}{c_2}\right) ^{2/3} = u_i \end{array}\right. } \end{aligned}$$
(9)

By Proposition 3, we know that

$$\begin{aligned} |\psi (s)| \le \frac{c_1 M^2}{u_m} \end{aligned}$$

So, if \(u_M\) satisfies hypothesis (\(HP_{max}^1\)), then (x(s), u(s)) cannot satisfy the conditions in (9), because

$$\begin{aligned} \left( \frac{|\psi (s)|}{c_2}\right) ^{2/3} \le \left( \frac{|c_1 M^2|}{c_2 u_m}\right) ^{2/3} < u_M. \end{aligned}$$

So, if we take a constant \(\varepsilon _c > 0\) such that

$$\begin{aligned} \left( \frac{|c_1 M^2 + \varepsilon _c|}{c_2 u_m}\right) ^{2/3} < u_M - \varepsilon _c \end{aligned}$$

then, for any \(s \in [0,L]\) and for any \((x,u) \in E\) such that

$$\begin{aligned} |x - x(s)| \le \varepsilon _c, \quad |u - u(s)| \le \varepsilon _c \end{aligned}$$
(10)

we have

$$\begin{aligned} \left( \frac{|\psi |}{c_2}\right) ^{2/3} \le \left( \frac{|c_1 M^2 + \varepsilon _c|}{c_2 u_m}\right) ^{2/3} < u_M - \varepsilon _c. \end{aligned}$$

Consequently, system (9) cannot be satisfied.

Thus, for any \((\alpha ,\beta ) \in G(x,u)\) with \((x,u)\) satisfying condition (10), where \(\alpha \) collects the components corresponding to the state x and \(\beta \) those corresponding to the control u, we have

$$\begin{aligned} \begin{aligned} |\alpha |&\le 1 \\ |\beta |&\ge \min \left( c_2 \frac{3}{2}\sqrt{u_i}, 1\right) \ge \min \left( c_2 \frac{3}{2}\sqrt{u_m}, 1\right) \end{aligned} \end{aligned}$$

So, by choosing

$$\begin{aligned} K_c = \max \left( \frac{2}{3c_2\sqrt{u_m}}, 1\right) \end{aligned}$$

we always have \(|\alpha | \le K_c |\beta |\). We also observe that in this situation we always have \(|(\alpha ,\beta )| \ne 0\). For this reason, we can apply Theorem 2, and consequently

$$\begin{aligned} N_E^P(x,u) \subset N_E^L(x,u) \subset \{\tilde{\lambda } (\alpha ,\beta ) \,: \, \tilde{\lambda } \ge 0, \, (\alpha ,\beta ) \in G(x,u) \} \end{aligned}$$

which concludes the proof. \(\square \)

Remark 2

The set \(G(x,u)\) defined in (8) is convex and closed. This means that also the set

$$\begin{aligned} \{\tilde{\lambda } \xi \,: \, \tilde{\lambda } \ge 0, \, \xi \in G(x,u) \} \end{aligned}$$

is convex and closed. Consequently, we also have the inclusion

$$\begin{aligned} N_E^C(x,u) \subset \{\tilde{\lambda } \xi \,: \, \tilde{\lambda } \ge 0, \, \xi \in G(x,u) \}, \end{aligned}$$

where \(E = \{(x,u) \,: \, h_c(x,u) \le 0 \}\).

For any fixed \(\xi \in N_E^C(x,u)\), we use \(\xi _{\psi }\) to denote the component of \(\xi \) corresponding to \(\psi \). The notations \(\xi _{u_i}\) and \(\xi _{\lambda }\) have a similar meaning. Furthermore, in view of the proof of Proposition 5, we can observe that

$$\begin{aligned} sgn(\xi _{\psi }) = sgn(\psi ). \end{aligned}$$

4.3.2 The Adjoint System

Given \({\bar{L}} \ge L > 0 \) and a measurable function \(x: [0,L] \rightarrow {\mathbb {R}}^n\), we define

$$\begin{aligned} x_{\bar{L}}(s) = {\left\{ \begin{array}{ll} x(s) &{} s \in [0,L] \\ x(L) &{} s \in [L, \bar{L}] \end{array}\right. }. \end{aligned}$$

We say that an admissible trajectory \((x^*,u^*,L^*)\) is an \(\varepsilon \)-local maximum for (\(p_c\)) if for any other admissible trajectory (xuL) such that

$$\begin{aligned} |L^* - L|, |x_{{\bar{L}}}(s) - x^*_{\bar{L}}(s)|, |u_{\bar{L}}(s) - u^*_{\bar{L}}(s)| \le \varepsilon \end{aligned}$$

for a.e. \(s \in [0,{\bar{L}}]\), where \({\bar{L}} = \max \{L^*,L\}\), the inequality \(L^* \ge L\) holds.

Theorem 4

Assume (\(HP_{max}^1\)) is satisfied and let \(\varepsilon _c\) be the constant of Proposition 5. Let \((x^*,u^*,L^*)\) be an \(\varepsilon _c\)-local maximum for (\(p_c\)) and define the set of constraints

$$\begin{aligned} E = \{(x,u) \,: \, h_c(x,u) \le 0 \}. \end{aligned}$$

Then, there exists an arc \(p = (p_{\psi },p_{\theta },p_{\mu }) \in W^{1,1}([0,L^*];{\mathbb {R}}^3)\) and a number \(\lambda _0 \in \;\{0,1\}\) satisfying the non-triviality condition

$$\begin{aligned} (\lambda _0,p(s)) \ne 0, \, \forall s \in [0,L^*] \end{aligned}$$

such that

$$\begin{aligned} {\left\{ \begin{array}{ll} p'_{\psi } = p_{\theta } \left( \frac{\lambda _1^*}{(u_1^*)^2} + \frac{\lambda _2^*}{(u_2^*)^2} \right) + \xi _{\psi } \\ p'_{\theta } = p_{\psi } (- c_1 \mu ^* \cos \theta ^*) \\ p'_{\mu } = - p_{\psi } c_1 \sin \theta ^* \\ p_{\psi }(0) = 0 \\ p_{\theta }(L^*) = 0 \\ \end{array}\right. } \end{aligned}$$
(11)

and

$$\begin{aligned} {\left\{ \begin{array}{ll} p_{\psi } c_1 \mu ^* \sin \theta ^* - p_{\theta } \psi ^* \left( \frac{\lambda _1^*}{(u_1^*)^2} + \frac{\lambda _2^*}{(u_2^*)^2} \right) - p_{\mu } (\lambda _1^* u_1^* + \lambda _2^* u_2^*) = - \lambda _0 \\ 2 p_{\theta } \psi ^* \frac{\lambda _i^*}{(u_i^*)^3} - p_{\mu } \lambda _i^* - \xi _{u_i} = 0 &{} i = 1,2 \\ -p_{\theta }\psi ^* \left( \frac{1}{(u_1^*)^2} - \frac{1}{(u_2^*)^2} \right) - p_{\mu } (u_1^* - u_2^*) -\xi _{\lambda } = 0 \end{array}\right. } \end{aligned}$$
(12)

almost everywhere in \([0,L^*]\), where \(\xi : [0,L^*] \rightarrow {\mathbb {R}}^7\) is a measurable function such that \(\xi (s) \in N_E^C(x^*(s),u^*(s))\) for a.e. \(s \in [0,L^*]\). Let

$$\begin{aligned} H(x,u,p) = p_{\psi } c_1 \mu \sin \theta - p_{\theta } \psi \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2} \right) - p_{\mu } (\lambda _1 u_1 + \lambda _2 u_2) \end{aligned}$$

be the unmaximized Hamiltonian. Then, the Weierstrass condition holds. So, for almost every \(s \in [0,L^*]\) and for every \(u \in {\mathbb {R}}^3\) such that \((x^*(s),u) \in E\) and \(|u - u^*(s)| \le \varepsilon _c\), we have

$$\begin{aligned} H(x^*(s),u,p(s)) \le H(x^*(s),u^*(s),p(s)) \end{aligned}$$
(13)

Notice that the non-triviality condition holds for every \(s \in [0,L^*]\).

Proof

We reformulate the free end-time problem (\(p_c\)) into a fixed end-time maximization by a transformation of the independent variable. So, we consider the following system

$$\begin{aligned} {\left\{ \begin{array}{ll} \min _{(u_1,u_2,\lambda ,\tau )} -\tau _a(L^*) \\ \text {Subject to} \\ \psi ' = c_1 \mu \sin (\theta ) \tau &{} \text { a.e. in } [0,L^*]\\ \theta ' = - \psi \tau \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2} \right) &{} \text { a.e. in } [0,L^*]\\ \mu ' = -\tau (\lambda _1 u_1 + \lambda _2 u_2) &{} \text { a.e. in } [0,L^*]\\ \tau _a' = \tau &{} \text { a.e. in } [0,L^*]\\ |\psi | - c_2 u_i^{3/2} \le 0 &{} i = 1,2 \,; \, \text { a.e. in } [0,L^*]\\ u_i \in [u_m,u_M] &{} i = 1,2 \,; \, \text { a.e. in } [0,L^*]\\ \lambda _1 = 1 - \lambda _2 \in [0,1] \\ \tau \in \left[ -\frac{\varepsilon _c}{L^*} + 1, \frac{\varepsilon _c}{L^*} + 1 \right] \\ \psi (L^*) = 0 \\ \theta (0) = \theta _0 \\ \mu (0) = M \,, \, \mu (L^*) = 0 \\ \tau _a(0) = 0 \end{array}\right. } \end{aligned}$$
(14)

where \(\tau \) is now a further control function. It is clear that if \((x^*,u^*,L^*)\) is an \(\varepsilon _c\)-local maximum for (\(p_c\)), then \((x^*,(u^*,\tau ^*\equiv 1))\) is an \(\varepsilon _c\)-local minimum of (14). The conclusion follows from an application of Pontryagin's maximum principle (see Theorem 2.1 of Clarke et al. (2010)). Indeed, in view of Proposition 5, we know that the bounded slope condition \(\varvec{BS_*^{\varepsilon ,R}}\) holds true for (14). It remains to verify the Lipschitz continuity of the function

$$\begin{aligned} f_{L}(x,u, \tau ) = \left( c_1 \mu \sin (\theta ) \tau , - \psi \tau \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2} \right) , -\tau (\lambda _1 u_1 + \lambda _2 u_2), \tau \right) \end{aligned}$$

for \((x,u)\) in the set

$$\begin{aligned} T_c(s) = \{(x,u) \,: \, |x - x^*(s)|,|u - u^*(s)| \le \varepsilon _c \}. \end{aligned}$$

for a.e. \(s \in [0,L^*]\). Thanks to Proposition 3, we know that \(|\psi ^*|\) and \(\mu ^*\) are bounded. This means that \(T_c(s)\) is a compact set. Moreover, \((x,0) \notin T_c(s)\) for any s. Consequently, \(f_L\) is smooth in \(T_c(s)\) and the Lipschitz continuity follows from the compactness of \(T_c(s)\). \(\square \)

5 Properties of the Adjoint Arc

In this section, we analyse the behaviour of the adjoint arc p in Theorem 4, in order to gain more information on the optimal control. From now on, we assume that the initial inclination of the elastic rod is

$$\begin{aligned} \theta _0 = \frac{\pi }{2}. \end{aligned}$$

Remark 3

In the notation of Theorems 3 and 4, for any \(s \in [0,L^*]\) such that \(h_{c}(x^*(s),u^*(s)) = 0\), any \(\xi (s) \in N_{E}^C(x^*(s),u^*(s))\) has the form

$$\begin{aligned} \xi = \tilde{\lambda } \begin{bmatrix} sgn(\psi ) (\lambda _{a,1} + \lambda _{a,2}) \\ 0 \\ 0 \\ 0 \\ - \lambda _{a,1} c_2 \frac{3}{2}\sqrt{u_1} - \lambda _{b,1} + \lambda _{c,1} \\ - \lambda _{a,2} c_2 \frac{3}{2}\sqrt{u_2} - \lambda _{b,2} + \lambda _{c,2} \\ - \lambda _{d} + \lambda _e \end{bmatrix} \end{aligned}$$

with \(\tilde{\lambda } \ge 0\) and \((\lambda _{a,1},\ldots , \lambda _{e}) \in \Lambda \). Recalling Remark 2, it follows from Proposition 4 (which gives \(\psi < 0\)) that

$$\begin{aligned} \xi _{\psi } \le 0. \end{aligned}$$

Moreover, recalling the proof of Proposition 5, if (\(HP_{max}^1\)) holds and \(u_i = u_M\), then \(\xi _{u_i} \ge 0\).

Proposition 6

Let \((x^*,u^*,L^*)\) be an optimal trajectory for problem (\(p_c\)). Let p be an adjoint arc which satisfies system (11) and assume that hypothesis (\(HP_{max}^1\)) and (\(HP_2c\)) hold. Then there exists \({\bar{s}} \in [0,L^*]\) such that if \({\bar{s}} < L^*\), then

$$\begin{aligned} {\left\{ \begin{array}{ll} p_{\theta }(s) > 0 &{} s \in [0,{\bar{s}}) \\ p_{\psi }(s) = p_{\theta }(s) = 0 &{} s \in [{\bar{s}},L^*] \end{array}\right. }. \end{aligned}$$

Otherwise, if \({\bar{s}} = L^*\), only the condition on \(p_{\theta }\) holds, that is, \(p_{\theta }(s) > 0\) for \(s \in [0,L^*)\).

Proof

Let us define the functions

$$\begin{aligned} \begin{aligned} a&= \left( \frac{\lambda ^*_1}{(u_1^*)^2} + \frac{\lambda ^*_2}{(u_2^*)^2} \right) \\ b&= -c_1 \mu ^* \cos \theta ^* \end{aligned}. \end{aligned}$$

Since \((x^*,u^*)\) is fixed, a and b can be regarded as known time-dependent coefficients in (11). Furthermore, in view of hypothesis (\(HP_2c\)), it follows from Proposition 4 that \(a > 0\) on \([0,L^*]\) and \(b > 0\) on \((0,L^*)\). Then the adjoint system (11) can be written as

$$\begin{aligned} {\left\{ \begin{array}{ll} p_{\psi }' = p_{\theta } a + \xi _{\psi }\\ p_{\theta }' = p_{\psi } b \\ p_{\psi } (0) = 0 \\ p_{\theta }(L^*) = 0 \end{array}\right. } \end{aligned}$$
(15)

where \(\xi (s) \in N_{E}^C(x^*(s),u^*(s))\).

As a first step in the proof of Proposition 6, we will prove the following claim.

Claim: there does not exist \(s_0 \in [0,L^*)\) such that

$$\begin{aligned} p_{\psi }(s_0) \le 0 \,, \, p_{\theta }(s_0) < 0. \end{aligned}$$
(16)

Let us define

$$\begin{aligned} s_1 = \min \{ s \in [s_0,L^*] \,: \, p_{\theta }(s) \ge 0 \}, \end{aligned}$$

with the convention \(\min \emptyset = +\infty \).

It follows from the assumption (16) that \(s_1 > s_0\). If \(s_1 \le L^*\), then \(p_{\theta }(s) < 0\) for \(s \in [s_0, s_1)\) and \(p_{\theta }(s_1) = 0\). Hence, it follows from Proposition 4, the condition (16) and Remark 3 that

$$\begin{aligned} p_{\psi }(s_1) = p_{\psi }(s_0) + \int _{s_0}^{s_1} p_{\theta }(\sigma ) a(\sigma )\textrm{d}\sigma +\int _{s_0}^{s_1} \xi _{\psi }(\sigma )\textrm{d}\sigma < 0 \end{aligned}$$

and, by continuity of \(p_{\psi }(\cdot )\), there exists \(\delta > 0\) such that

$$\begin{aligned} p_{\psi }(s) < 0 \qquad \textrm{for}\;\textrm{all}\; s \in (s_1 - \delta , s_1 + \delta ). \end{aligned}$$
(17)

However, by appealing again to Proposition 4, condition (16) and (17), one has that

$$\begin{aligned} p_{\theta }(s_1) = p_{\theta }(s_1 - \delta ) + \int _{s_1 - \delta }^{s_1} p_{\psi }(\sigma ) b(\sigma ) \textrm{d} \sigma < 0, \end{aligned}$$

which contradicts the relation \(p_{\theta }(s_1) = 0\). Hence the case \(s_1\le L^*\) cannot occur, implying that \(s_1 = +\infty \). But this situation cannot occur either, since it implies in particular \(p_{\theta }(L^*) < 0\), contradicting the boundary condition \(p_{\theta }(L^*) = 0\). Hence, condition (16) cannot occur, and this proves the claim.

The main implication of the claim is to rule out the initial condition \(p_{\theta }(0) < 0\): since \(p_{\psi }(0) = 0\), this follows immediately from an application of the claim with \(s_0 = 0\), combined with the results of Proposition 4. The following situations remain to be studied:

Case 1: \(p_{\theta }(0) = 0\).

In this case, we will show that one can only have \(p_{\psi } \equiv p_{\theta } \equiv 0\). To achieve this goal, we consider the linear part of (15):

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _s p_{\psi } = p_{\theta } a \\ \partial _s p_{\theta } = p_{\psi } b \end{array}\right. } . \end{aligned}$$
(18)

Let M be the fundamental matrix solution to (18), with \(M(0) = \text {Id}\). Then the entries \(M_{i,j}\) are non-negative and increasing in \([0,L^*]\). Since \(\xi _{\psi } \le 0\) by Remark 3 and since \(a, b \ge 0\) in \([0,L^*]\), the Duhamel formula applied to (15) yields \(p_{\psi }, p_{\theta } \le 0\) in \([0,L^*]\). By the dynamics (15) and the claim, the only possibility is then to have \(p_{\psi } \equiv p_{\theta } \equiv 0\).

Case 2: \(p_{\theta }(0) > 0\).

Let us set

$$\begin{aligned} {\bar{s}} = \min \{s \in [0,L^*] \,: \, p_{\theta }(s) \le 0 \}. \end{aligned}$$

Then one has that \({\bar{s}} > 0\) and \(p_{\theta }({\bar{s}}) = 0\). If \({\bar{s}} = L^*\), then \(p_{\theta }(s)>0\) for all \(s\in [0,L^*)\) and we have nothing to prove. If \({\bar{s}} < L^*\), by arguing again as in the proof of the claim, one can show that it is not possible to have \(p_{\psi }({\bar{s}}) > 0\), because this would imply \(p_{\theta }({\bar{s}}) > 0\), contradicting the definition of \({\bar{s}}\). Furthermore, one cannot obtain \(p_{\psi }({\bar{s}}) < 0\). Indeed, if \(p_{\psi }({\bar{s}}) < 0\), then by continuity of the adjoint arc \(p_{\psi }(\cdot )\), there exists \(\delta > 0\) such that \(p_{\psi }(s) < 0\) for any \(s \in [{\bar{s}} - \delta , {\bar{s}} + \delta ]\). It then follows that

$$\begin{aligned} p_{\theta }({\bar{s}} + \delta ) = \int _{{\bar{s}}}^{{\bar{s}} + \delta } p_{\psi }(\sigma ) b(\sigma ) \textrm{d} \sigma < 0 \end{aligned}$$

However, the latter inequality, combined with \(p_{\psi }({\bar{s}} + \delta ) < 0\), contradicts the claim. Hence, the only possibility is that \(p_{\psi }({\bar{s}}) = 0\), and by applying the arguments of Case 1 in the interval \([{\bar{s}}, L^*]\), one has that \(p_{\psi } \equiv p_{\theta } \equiv 0\) in \([{\bar{s}},L^*]\) and that \(p_{\theta }(s)>0\) for all \(s\in [0,{\bar{s}})\). This completes the proof. \(\square \)

Proposition 7

Let \((x^*,u^*,L^*)\) be an optimal trajectory for problem (\(P_c\)). Let \(p=(p_\psi , p_\theta , p_\mu )\) be an adjoint arc satisfying system (11) and assume that hypotheses (\(HP_{max}^1\)) and (\(HP_2c\)) hold true. Then

$$\begin{aligned} p_{\mu }(s) \ge 0 \qquad \textrm{for}\;\textrm{all}\;s\in [0,L^*]. \end{aligned}$$

Proof

We will structure the proof of the proposition in three main steps.

Step 1: One has that \(p_{\mu }(0) \ge 0\). Assume by contradiction that \(p_{\mu }(0) < 0\). By continuity of \(p_{\mu }(\cdot )\), there exist constants \(\varepsilon ,\delta > 0\) such that

$$\begin{aligned} p_{\psi }(s)\, c_1 \mu (s) \sin \theta (s)&> - \varepsilon , \\ p_{\mu }(s)\big (\lambda _1(s) u_1(s) + \lambda _2(s) u_2(s)\big ) \le p_{\mu }(s)\, u_m&< -\varepsilon \end{aligned}$$

for a.e. \(s\in [0,\delta ]\). Hence, by using the first equation in (12), one has that

$$\begin{aligned} p_{\psi }c_1 \mu \sin \theta - p_{\mu }(\lambda _1 u_1 + \lambda _2 u_2) > 0, \qquad \text {a.e. in }\; [0,\delta ]. \end{aligned}$$

On the other hand, it follows from Proposition 6 that \(p_{\theta } \ge 0\) and from Proposition 4 that \(\psi ^*\le 0\) for all \(s\in [0,L^*)\). So, by using again the first equation of system (12), one obtains the relation

$$\begin{aligned} p_{\psi }c_1 \mu \sin \theta - p_{\mu }(\lambda _1 u_1 + \lambda _2 u_2) \le 0\qquad \mathrm {a.e.}\; s\in [0,L^*] \end{aligned}$$

reaching a contradiction. This shows that \(p_{\mu }(0) \ge 0\).

Step 2: Let us set

$$\begin{aligned} A = \{ s \in [0,L^*] \,: \, p_{\mu }(s) \le 0 \}. \end{aligned}$$

Then \(p_{\psi } \le 0\) for a.e. \(s \in A\).

Indeed, let us recall again that Proposition 6 asserts that \(p_{\theta } \ge 0\) for all \(s\in [0,L^*)\), while Proposition 4 asserts that \(\psi ^*\le 0\) for all \(s\in [0,L^*)\). It then follows from the first equation of system (12) that

$$\begin{aligned} p_{\psi }c_1 \mu \sin \theta - p_{\mu }(\lambda _1 u_1 + \lambda _2 u_2) \le 0, \qquad \mathrm {a.e.}\; s\in A \end{aligned}$$

that is

$$\begin{aligned} p_{\psi } \le p_{\mu }\frac{\lambda _1 u_1 + \lambda _2 u_2 }{c_1 \mu \sin \theta } \le 0, \qquad \mathrm {a.e.}\; s\in A. \end{aligned}$$

This shows the assertion of Step 2.

Step 3: \(p_{\mu }(s) \ge 0\) for all \(s\in [0,L^*]\).

Assume that there exists \(s_0 \in [0,L^*]\) such that \(p_{\mu }(s_0) < 0\) and define

$$\begin{aligned} s_1 = \sup \{ s < s_0 \,: \, p_{\mu }(s) \ge 0 \}. \end{aligned}$$

Since \(p_{\mu }(0) \ge 0\), \(s_1 \in [0,s_0)\) and \(p_{\mu }(s_1) = 0\). In view of Step 2, one has that \(p_{\psi }(s) \le 0\) for a.e. \(s \in [s_1,s_0]\). Using the third equation in system (11), we obtain the inequality

$$\begin{aligned} p_{\mu }(s_0) = - \int _{s_1}^{s_0}p_{\psi }(\sigma )\, c_1 \sin \theta (\sigma ) \,\textrm{d}\sigma \ge 0 \end{aligned}$$

and reach a contradiction. This completes the proof of Proposition 7. \(\square \)

The non-negativity of \(p_{\theta }\) and \(p_{\mu }\) is important for the determination of the optimal control. In the following, we use \({\mathcal {L}}\) to denote the Lebesgue measure on \({\mathbb {R}}\).

Theorem 5

Let \((x^*,u^*,L^*)\) be an optimal trajectory for problem (\(P_c\)). Assume that hypotheses (\(HP_2c\)) and (\(HP_{max}^1\)) hold true. Then, for a.e. \(s \in [0,L^*]\), one has

$$\begin{aligned} u_1^*(s) = u_2^*(s) = \max \left\{ u_m, \, \left( \frac{|\psi ^*(s)|}{c_2} \right) ^{2/3} \right\} . \end{aligned}$$
(19)

Proof

It follows from the control constraints of the optimal control problem that

$$\begin{aligned} u_m \le u^*_i \le u_M\;\; \textrm{and} \;\; u_i^* \ge \left( \frac{|\psi ^*|}{c_2}\right) ^{2/3}\quad \textrm{for}\; i = 1,2. \end{aligned}$$

As a first step, we prove that

$$\begin{aligned} \lambda _i^*(s) > 0 \implies u_i^*(s) \in \left\{ \max \left\{ u_m, \left( \frac{|\psi ^*(s)|}{c_2}\right) ^{2/3} \right\} , u_M\right\} , \end{aligned}$$
(20)

that is, \(u_i^*\) cannot be an interior point of the admissible range of control values. Assume that this is not true for \(i = 1\). Then there exists a set \(D \subset [0,L^*]\) such that \({\mathcal {L}}(D) > 0\) and

$$\begin{aligned} \lambda _1^*(s)>0 \,, \, u_1^*(s) \notin \left\{ \max \left\{ u_m, \left( \frac{|\psi ^*|}{c_2}\right) ^{2/3} \right\} , u_M\right\} \end{aligned}$$

for every \(s \in D\). Let p be an adjoint arc given by Theorem 4. From system (12) we have

$$\begin{aligned} \frac{\partial }{\partial u_1}H(x^*,u^*,p) = -p_{\mu } \lambda _1^* -2 p_{\theta }(-\psi ^*) \frac{\lambda _1^*}{(u_1^*)^3} = 0 \end{aligned}$$
(21)

a.e. in D. So, \(u_1^*(s)\) is a.e. a critical point of the Hamiltonian. Let \({\bar{s}}\in [0,L^*]\) be the time appearing in the statement of Proposition 6. It then follows from Propositions 4 and 6 that

$$\begin{aligned} \frac{\partial ^2}{\partial u_1^2} H(x^*(s),u^*(s),p(s)) = 6 p_{\theta }(s) (-\psi ^*(s)) \frac{\lambda _1^*(s)}{(u_1^*(s))^4} {\left\{ \begin{array}{ll} > 0 &{} \text { a.e. in } [0,\bar{s}) \cap D \\ = 0 &{} \text { a.e. in } [\bar{s},L^*] \cap D \end{array}\right. } \end{aligned}$$
(22)

If \({\mathcal {L}}(D \cap [0,{\bar{s}}]) > 0\), then conditions (21)–(22) imply that \(u_1^*(s)\) is a minimum point for H on a set of positive measure, contradicting condition (13). Therefore, we can assume \(D \subset ({\bar{s}},L^*]\). Since Proposition 6 gives \(p_{\psi } \equiv p_{\theta } \equiv 0\) in \([{\bar{s}},L^*]\), condition (21) implies \(p_{\mu } = 0\) in D, and hence \(p_{\psi } \equiv p_{\theta } \equiv p_{\mu } \equiv 0\) in D. By using the first equation of system (12), one obtains the relation \(\lambda _0 = 0\). However, this violates the non-triviality condition a.e. in D, reaching a contradiction. Hence relation (20) has to be satisfied.

We can now prove the thesis. Assume by contradiction that there exists a set E such that \({\mathcal {L}}(E) > 0\) and \(\lambda _1^*(s) > 0\), \(u_1^*(s) = u_M\) for a.e. \(s \in E\). It follows from the second equation of system (12) and from the inequality \(\xi _{u_1}(s)\ge 0\) for a.e. \(s\in [0,L^*]\) (see Remark 3) that

$$\begin{aligned} 2 (-p_{\theta }) (-\psi ^*) \frac{\lambda _1^*}{(u_1^*)^3} - p_{\mu } \lambda _1^* \ge 0 \end{aligned}$$

a.e. in E. On the other hand, it follows from Propositions 6 and 7 that the previous inequality can hold only if \(p_{\mu } \equiv p_{\theta } \equiv 0\) a.e. in E. By using again Proposition 6, this implies that also \(p_{\psi } \equiv 0\) in \([\textrm{ess}\inf (E), L^*]\). Hence, by appealing again to the first equation of system (12), there exists a set of positive Lebesgue measure on which \(\lambda _0 = 0\). This contradicts the non-triviality condition appearing in the necessary conditions. The case of \(u_2\) is analogous. This completes the proof. \(\square \)

Remark 4

Problem (\(P_c\)) is a relaxation of the original problem (P). Nevertheless, Theorem 5 states that the optimal control is not relaxed, in the sense that \(u_1^* = u_2^*\) a.e. in \([0,L^*]\). Consequently, Theorem 5 shows the equivalence between problems (\(P_c\)) and (P), since in both cases the optimal control (19) leads to the same optimal length \(L^*\).
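From a computational standpoint, the feedback law (19) is immediate to evaluate pointwise. The following minimal Python sketch implements it; the argument names follow the paper's notation, while the numerical values in the example call are illustrative placeholders rather than Table 1 entries.

```python
import numpy as np

def u_star(psi, u_m, c2):
    """Feedback law (19): u* = max(u_m, (|psi|/c2)^(2/3)).

    Accepts scalars or NumPy arrays of psi values."""
    return np.maximum(u_m, (np.abs(psi) / c2) ** (2.0 / 3.0))

# Example call with placeholder values (not the paper's parameters):
print(u_star(psi=-2e-9, u_m=1e-6, c2=1e-3))
```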

Remark 5

All the arguments used so far are valid independently of the normality of the problem. By the first equation of system (12), if the adjoint arc p is null, then the multiplier \(\lambda _0\) is null as well. This excludes the possibility of having a trivial p and a non-trivial \(\lambda _0\). However, the converse, and consequently the normality of the problem itself, is not immediate. A multiplier \(\lambda _0 \ne 0\) can keep track of any rescaling operation that involves the adjoint arc p. As we will see in the next section, this leads to a procedure for the numerical computation of the adjoint trajectories.

Remark 6

Problem \((P_c)\) admits only regular extremals. Indeed, by Proposition 4, any admissible trajectory for problem \((P_c)\) has \(\psi \ne 0\) for \(s \ne L\). Let \(D \subset [0,L^*]\) be a subset of positive measure in which some extremal is singular. By the second condition of system (12), any adjoint arc p that corresponds to the singular extremal must have \(p_{\theta }, p_{\mu } \equiv 0\) in D. By Proposition 6, this implies also \(p_{\psi } \equiv 0\) in D, hence also \(\lambda _0 = 0\) in D by the first equation of (12). These relations do not satisfy the non-triviality condition \((p,\lambda _0) \ne 0\) for all \(s \in [0,L^*]\), so Problem \((P_c)\) does not admit singular extremals.

Remark 7

In the proofs of Proposition 7 and Theorem 5, we have used the condition that along the optimal trajectory, the Hamiltonian H is constantly equal to \(- \lambda _0\). This equality holds true because the dynamics used for problem (P) is autonomous, that is, f, \(h_{P,1}\), \(h_{P,2}\) and \({\mathcal {U}}\) do not depend explicitly on \(s \in [0,L]\). This “time” independence follows from our modelling assumptions. For instance, we considered the volume density \(\rho _3\) and Young’s modulus E constant all along the shoot. The inclusion of a time-dependent dynamics would require further analysis of the sign of H.

Fig. 2

Numerical integration of system (23) (a–c) and optimal radius (d). As stated in Proposition 4, both \(\psi \) and \(\theta \) are increasing. The former is always negative, while the latter always lies in the interval \([\pi /2,\pi )\). As displayed in Fig. 2d, the radius decreases until about 0.8 m from the base, where it becomes constant. This is because, at that point, the optimal \(u^*\) reaches the minimal value \(u_m\). Since \(|\psi |\) is decreasing, from that point on \(u^*\) is constantly equal to \(u_m\)

Table 1 Parameters used for the numerical integration of system (23)

6 Simulations

If conditions (\(HP_{max}^1\)) and (\(HP_2c\)) hold true, then Theorem 5 determines the optimal control in feedback form. As discussed in Remark 4, this result prescribes \(u_1^* = u_2^*\) a.e., so we can take \(\lambda _1^* \equiv 1\) and problem (\(P_c\)) becomes equivalent to (P). Therefore, the optimal trajectory solves the boundary value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \psi ' = c_1 \mu \sin (\theta ) \\ \theta ' = - \frac{\psi }{(u^*)^2} \\ \mu ' = - u^* \\ \psi (L^*) = 0 \\ \theta (0) = \frac{\pi }{2} \\ \mu (0) = M \,, \, \mu (L^*) = 0 \\ \hline u^* = \max \left( u_m, \left( \frac{|\psi |}{c_2}\right) ^{2/3} \right) \\ c_1 = \frac{4 g \rho _3}{E} \,, \, c_2 = \frac{ \bar{\sigma } }{E} \end{array}\right. } \end{aligned}$$
(23)

Our parameter pool (see Table 1) does not include the value of the parameter \({\bar{\sigma }}\). On the other hand, we assume the maximal length \(L^*\) to be known. So, to integrate system (23), we have to determine the values of \(\psi (0)\) and \(c_2\) such that \(\psi (L^*) = \mu (L^*) = 0\). To achieve these endpoint conditions, we wrote a Matlab script which employs the function bvp5c to solve the boundary value problem. The parameters displayed in Table 1 lead to a constant \(c_1\) which satisfies condition (\(HP_2c\)). Indeed,

$$\begin{aligned} c_1 = \frac{4g\rho _3}{E} = 13.08 \cdot 10^{-6} < 6 \cdot 10^{-3} \approx \frac{\pi }{2} \left( \frac{u_m}{M} \right) ^3. \end{aligned}$$

Since we do not need to set any value of \(u_M\), conditions (\({HP_{max}}\)) and (\(HP_{max}^1\)) are immediately satisfied.
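The same computation can be reproduced outside Matlab. The sketch below uses SciPy's solve_bvp, treating \(c_2\) as an unknown parameter fixed by the extra endpoint condition \(\mu (L^*) = 0\), while the free value \(\psi (0)\) is handled automatically by the collocation solver. It is a minimal Python analogue of our bvp5c script, with placeholder values standing in for Table 1:

```python
import numpy as np
from scipy.integrate import solve_bvp

# Placeholder parameters (Table 1 values are not reproduced here).
g, rho3, E = 9.81, 1.0e3, 3.0e9      # gravity, volume density, Young's modulus
M, u_m, L_star = 0.05, 1.0e-6, 1.0   # total mass, control lower bound, length
c1 = 4.0 * g * rho3 / E

def rhs(s, y, p):
    psi, theta, mu = y
    c2 = p[0]                                               # unknown parameter
    u = np.maximum(u_m, (np.abs(psi) / c2) ** (2.0 / 3.0))  # feedback law (19)
    return np.vstack((c1 * mu * np.sin(theta),              # psi'
                      -psi / u ** 2,                        # theta'
                      -u))                                  # mu'

def bc(ya, yb, p):
    # psi(L*) = 0, theta(0) = pi/2, mu(0) = M, mu(L*) = 0
    return np.array([yb[0], ya[1] - np.pi / 2.0, ya[2] - M, yb[2]])

s = np.linspace(0.0, L_star, 400)
guess = np.vstack((-1e-6 * (1.0 - s / L_star),              # psi: small, negative
                   np.full_like(s, np.pi / 2.0),            # theta: near pi/2
                   M * (1.0 - s / L_star)))                 # mu: linear decay
sol = solve_bvp(rhs, bc, s, guess, p=[1e-4], tol=1e-8)
print("c2 =", sol.p[0], " psi(0) =", sol.y[0, 0])
```

The four boundary residuals match the three-dimensional state plus one unknown parameter, which is exactly the well-posedness count required by the solver.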

Fig. 3

The adjoint arc (a–c) and the Hamiltonian (d). Each graph displays four iterations of the process expressed by Eq. (26). The iterations show that each component of the adjoint arc and the Hamiltonian converge at least pointwise. We notice that \(p_{\theta }\) and \(p_{\mu }\) are always non-negative. So the adjoint arc agrees with the results of Propositions 6 and 7, and the iterates of the Hamiltonian converge to a function constantly equal to \(-1\)

6.1 Simulation of the Adjoint System

The normal cone in the dynamics of the adjoint arc makes the simulation of the adjoint system a non-trivial procedure. However, the second equation of system (12) and Remark 3 suggest the following heuristic iterative process to generate the component \(\xi _{\psi }\) of the normal vector.

Let \(s_0 \in [0,L^*]\) be the value at which

$$\begin{aligned} u^*(s_0) = \left( \frac{|\psi |(s_0)}{c_2}\right) ^{2/3} = u_m. \end{aligned}$$

Assume that Eq. (11) has been rescaled with a piecewise constant function \(\gamma _0\) in \([0,L^*]\), so that

$$\begin{aligned} \xi _{\psi ,0}(s) = \frac{\xi _{\psi }(s)}{\gamma _0(s)} = {\left\{ \begin{array}{ll} -1 &{} s \in [0,s_0] \\ 0 &{} s \in [s_0,L^*] \end{array}\right. }. \end{aligned}$$

The value of \(s_0\) can be estimated by integrating system (23); for \(s > s_0\) the trajectory x does not activate the constraint on \(|\psi |\), that is, \(|\psi | < u_m^{3/2} c_2\). Consequently, \(\xi _{\psi ,0} \equiv \xi _{\psi } \equiv 0\) in \([s_0,L^*]\). Denote by \(p_{\psi ,0}, p_{\theta ,0}\) the solutions to the boundary value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} p'_{\psi } = \frac{p_{\theta }}{(u^*)^2} - \xi _{\psi ,0} \\ p'_{\theta } = p_{\psi }(-c_1 \mu \cos \theta ) \\ p_{\psi }(0) = 0 \\ p_{\theta }(L^*) = 0 \end{array}\right. }. \end{aligned}$$
(24)

Here, \(p_{\psi ,0}\) and \(p_{\theta ,0}\) are the first two components of the solution p to the adjoint system (11), rescaled by \(\gamma _0\). That is, if p is a solution of the adjoint system (11), then \((p_{\psi }/\gamma _0, p_{\theta }/\gamma _0)\) is a solution of (24) and vice-versa. The component \(p_{\mu }\) is not considered in system (24) because it does not affect the behaviour of the other components of the adjoint arc and because we do not have any information on its boundary values. However, we can retrieve the values of \(p_{\mu ,0} = p_{\mu }/\gamma _0\) in \([0,s_0]\) by considering the second equation of system (12). So, we have

$$\begin{aligned} p_{\mu ,0} = 2 p_{\theta ,0}\psi \frac{1}{(u^*)^3} - \xi _{u,0} \end{aligned}$$

with \(\xi _{u,0} = \xi _{u}/\gamma _0\).

In the interval \([0,s_0]\) we know that without the rescaling by \(\gamma _0\) we have (see Remark 3)

$$\begin{aligned} \begin{aligned} \xi _{\psi }&= -\tilde{\lambda } \\ \xi _{u}&= -\tilde{\lambda } c_2 \frac{3}{2} \sqrt{u^*} \end{aligned} \end{aligned}$$

Since \(\xi _{\psi }/\gamma _0 = \xi _{\psi ,0}\), we deduce that in \([0,s_0]\)

$$\begin{aligned} \xi _{u,0} = \xi _{\psi ,0} \cdot c_2 \frac{3}{2} \sqrt{u^*} \end{aligned}$$

This allows us to estimate \(p_{\mu ,0}\) in \([0,s_0]\).

We now make a further assumption: the problem is normal, that is, in the first equation of system (12) we have \(\lambda _0 = 1\). Hence, we can estimate \(\gamma _0\) in \([0,s_0]\), since we have

$$\begin{aligned} p_{\psi ,0}c_1\mu \sin \theta - p_{\theta ,0}\frac{1}{(u^*)^2} - p_{\mu ,0} u^* = - \frac{1}{\gamma _0} \end{aligned}$$
(25)

Of course, this reasoning works only under the assumption that the rescaling function \(\gamma _0\) is at most piecewise constant. However, we can iterate this procedure by considering

$$\begin{aligned} \begin{aligned} \xi _{\psi ,i}&= - \gamma _{i-1} \xi _{\psi ,i-1} \\ \xi _{u,i}&= \xi _{\psi ,i} c_2 \frac{3}{2} \sqrt{u^*} \\ p_{\mu ,i}&= 2 p_{\theta ,i}\psi \frac{1}{(u^*)^3} - \xi _{u,i} \end{aligned} \end{aligned}$$
(26)

and taking \(p_{\psi ,i}\), \(p_{\theta ,i}\) as the solutions to system (24) with \(\xi _{\psi ,i}\) in place of \(\xi _{\psi ,0}\), for \(i = 1,2,\ldots \)
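The following Python sketch makes the iteration explicit. It transcribes equations (24)–(26) exactly as written above, freezing the state \((\psi , \theta , \mu , u^*)\) on the grid produced by the forward solve of system (23). The helper names and the default of four iterations (as in Fig. 3) are our own choices; this is a heuristic illustration, not the Matlab code used for the figures.

```python
import numpy as np
from scipy.integrate import solve_bvp

def adjoint_iterations(s, psi, theta, mu, u, c1, c2, s0, n_iter=4):
    # s, psi, theta, mu, u: sampled forward solution of system (23).
    # Returns the last iterate (p_psi, p_theta, p_mu) on the grid s.
    xi_psi = np.where(s <= s0, -1.0, 0.0)                # xi_{psi,0}
    for _ in range(n_iter):
        U = lambda t: np.interp(t, s, u)
        Mu = lambda t: np.interp(t, s, mu)
        Th = lambda t: np.interp(t, s, theta)
        Xi = lambda t: np.interp(t, s, xi_psi)

        def linear_rhs(t, p):                            # rescaled system (24)
            return np.vstack((p[1] / U(t) ** 2 - Xi(t),
                              p[0] * (-c1 * Mu(t) * np.cos(Th(t)))))

        def bc(pa, pb):                                  # p_psi(0)=0, p_theta(L*)=0
            return np.array([pa[0], pb[1]])

        sol = solve_bvp(linear_rhs, bc, s, np.zeros((2, s.size)))
        p_psi, p_theta = sol.sol(s)

        xi_u = xi_psi * c2 * 1.5 * np.sqrt(u)            # from Remark 3
        p_mu = 2.0 * p_theta * psi / u ** 3 - xi_u       # third equation of (26)
        H = p_psi * c1 * mu * np.sin(theta) - p_theta / u ** 2 - p_mu * u
        gamma = np.ones_like(s)
        mask = s <= s0
        gamma[mask] = -1.0 / H[mask]                     # normality, Eq. (25)
        xi_psi = np.where(mask, -gamma * xi_psi, 0.0)    # update rule (26)
    return p_psi, p_theta, p_mu
```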

7 Results

The numerical solution of system (23) is displayed in the graphs of Fig. 2. The optimal trajectory respects the conditions of Propositions 6 and 7, as we expect since condition (\(HP_2c\)) is satisfied. In particular, \(\psi \) is negative and increasing, which means that \(|\psi |\) is decreasing. By Theorem 5, the optimal control is an increasing function of \(|\psi |\); consequently, the optimal radius \(\sqrt{u^*}\) decreases until the control reaches the value \(u_m\).

Regarding the simulation of the adjoint system, as displayed in Fig. 3, the sequence of adjoint arcs \((p_{\psi ,i}, p_{\theta ,i},p_{\mu ,i})\) converges to some function \((p_{\psi },p_{\theta },p_{\mu })\) and the Hamiltonian converges to the constant value \(-1\). Again, the simulated trajectories respect the conditions of Propositions 6 and 7. So, we observe a positive \(p_{\theta }\) in \([0,L^*)\) and a non-negative \(p_{\mu }\).

8 Discussion

The control u determines the rate of mass decrease \(\mu '\) and without the constraint

$$\begin{aligned} u \ge \left( \frac{|\psi |}{c_2} \right) ^{2/3}, \end{aligned}$$
(27)

it can assume any value in the interval \([u_m,u_M]\). Since we require \(\mu (0) = M\) and \(\mu (L) = 0\), the lower u is, the slower \(\mu \) decreases, and the larger L is. In this situation, the optimal strategy would be to take the lowest allowed value of u, that is \(u_m\). This reasoning motivates the intuition behind relation (19) for the optimal control, since in (P) the constraint (27) gives a lower bound for u.
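This intuition can be made quantitative. Integrating the mass equation \(\mu ' = -u\) of system (23) between the endpoint conditions \(\mu (0) = M\) and \(\mu (L) = 0\) gives

$$\begin{aligned} M = \int _0^{L} u(s)\,\textrm{d}s \ge u_m L, \qquad \text {that is} \qquad L \le \frac{M}{u_m}, \end{aligned}$$

with equality exactly when \(u \equiv u_m\): in the absence of constraint (27), the control \(u \equiv u_m\) saturates this upper bound on the length.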

From the relations

$$\begin{aligned} \begin{aligned} u&= R^2; \\ \psi&= - R^4 \theta ', \end{aligned} \end{aligned}$$

where we recall that R is the radius of the cross-section, we can reformulate equation (19) as follows. In the region where the stress constraint is active, i.e. \(u^* = (|\psi |/c_2)^{2/3} > u_m\), substituting \(u = R^2\) and \(|\psi | = R^4 |\theta '|\) gives \(R^3 = R^4|\theta '|/c_2\), that is:

$$\begin{aligned} R = \frac{c_2}{|\theta '|}. \end{aligned}$$
(28)

Equation (28) gives a relation between the curvature \(|\theta '|\) and the radius of the cross-section R. Rephrasing, if we assume that searcher shoots grow optimizing their length, then their cross-section is regulated by a feedback control system. Indeed, at each point of the shoot, the cross-section, together with other physical parameters and forces such as gravity, influences the curvature of the shoot. Then, as expressed by equation (28), the resulting curvature provides feedback on the cross-section: the more the shoot deviates from a straight line, the thinner the cross-section. This closes the feedback loop, which is repeated at the successive points.
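As a numerical sanity check, relation (28) can be verified directly on the output of the solve_bvp sketch of Sect. 6; the variables sol, s and u_m below refer to that sketch and are assumptions carried over from it, not part of the analysis above.

```python
# Verify relation (28) on the region where the stress constraint is active.
psi, theta, mu = sol.sol(s)
c2 = sol.p[0]
u = np.maximum(u_m, (np.abs(psi) / c2) ** (2.0 / 3.0))   # feedback law (19)
R = np.sqrt(u)                                           # radius, since u = R^2
curvature = np.abs(-psi / u ** 2)                        # |theta'| from (23)
active = u > u_m
print(np.allclose(R[active] * curvature[active], c2, rtol=1e-3))
```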

The feedback control mechanism is common in biological systems and in particular in plants (Meroz 2021). Consider for instance the gravitropic mechanism. In general, plants are able to perceive their local inclination and their local curvature through some specialized cells (Chauvet et al. 2019; Moulia et al. 2021). This information is compared with a target inclination and a target curvature (Meroz and Silk 2020), inducing a flux of hormones to regulate growth and posture (Meroz et al. 2019; Moulton et al. 2020). The plant then attains a new shape, which has different local inclinations and curvatures, and the cycle is repeated.

Further improvements in the modelling can be achieved by considering, for instance, (i) variable volume density and Young’s modulus, and (ii) the mass of the leaves. In addition, \(c_2\) is a dimensionless constant: from Eq. (28), it equals the product of the radius R of the cross-section and the curvature \(|\theta '|\). A comparison with experimental data on radius and curvature would improve the accuracy of the model.

In addition to a deeper insight into the plant’s ecological behaviour, our study can inform other research fields. In robotics, for instance, Euler–Bernoulli beam models are used to design soft robots such as gripping hands (Zhou et al. 2015) or arms (Olson 2020; Sipos and Várkonyi 2020). In the latter case in particular, the problem of the longest self-supporting structure is studied from a stability point of view. A comparison between results of this kind and our study constitutes a stimulating possible research direction.

9 Conclusion

Control theory has an extremely wide range of applications, from the design of mechanical devices to physics, economics and biology. Starting from a physical model of a searcher shoot based on the Euler–Bernoulli theory of elastic rods, we used optimal control theory to study the behaviour of the radius that maximises the length. To achieve this goal, we formulated system (P), which turned out to be a boundary value problem with nonlinear dynamics and constraints on the state variable. Our approach to this problem consisted of a relaxation to convexify the dynamics and the application of the Pontryagin Maximum Principle for mixed state-constrained systems. The resulting optimal control, expressed in Theorem 5, proves the equivalence between the original problem (P) and the relaxed one (\(P_c\)). Moreover, it gives a relation between the radius and the inverse of the curvature through the dimensionless constant \(c_2\).