Abstract
The characterization of the self-supporting slender structure of greatest length is of interest from both a mechanical and a biological point of view. From a mechanical perspective, this classical problem has been studied with different methods, for example using similarity solutions and stable manifolds; however, none of these led to a complete analytical solution. From a biological perspective, plant structures such as tree branches or searcher shoots in climbing plants can be regarded as elastic cantilevered beams. In this paper, we formulate the problem as a non-convex optimisation problem with mixed state constraints. The problem is solved by analysing the corresponding relaxation. With this method, it is possible to obtain an analytical characterization of the cross-section.
1 Introduction
In the last decades, there has been increasing interest in plant modelling. Indeed, recent studies on how plants perceive and react to the external environment (see Chauvet et al. 2019; Bastien et al. 2013; Moulia et al. 2019; Meroz et al. 2019 for instance) have led to a deeper insight into the mechanisms of plant growth. Current models consider proprioception (Bastien et al. 2013), internal fluxes of hormones (Moulton et al. 2020) or memory in the processing of external cues (Meroz et al. 2019). Plant self-supporting structures are modelled as morphoelastic rods, whose curvatures change in time according to the plant's sensing activity. Furthermore, plants exhibit great intra- and interspecific variability in their biomechanical properties (Rowe et al. 2004). In particular, climbing plants are a clear example of this structural variety. Consider the species Condylocarpon guianense, a common liana widely found in the flora of French Guiana. C. guianense is a twining climbing plant, which means that it reaches the canopy of the forest by twining around the branches and trunks of its hosts. Several studies on its structure (see for instance Rowe and Speck 1996; Rowe et al. 2004; Rowe and Speck 2015 for a biological insight, or Vecchiato et al. (2023) for a modelling point of view) have revealed that in different growth stages it changes the thickness and the nature of the layers that form its stem and, consequently, its stiffness. More specifically, the plant is stiffer when and where it is developing a self-supporting state, while it displays a less dense material and a thicker compliant cortex when attached to a support.
Thanks to their wide adaptability, plants have become a source of inspiration for new technologies and robots (Fiorello et al. 2020; Mazzolai et al. 2014). For instance, the movement of roots into the ground has inspired the development of new methods for soil exploration (Mazzolai 2017). Likewise, the efficient strategies plants use to attach their bodies to external supports have led to the development of a robot that imitates the wrapping movement and the stem stiffening of tendrils (Meder et al. 2022). In this paper, we are interested in modelling the self-supporting structures developed by climbing plants. These structures, called searcher shoots, are generated by the plant in order to explore and find support. The mechanics of searcher shoots is an interesting and challenging subject of study (Hattermann et al. 2022), because they exhibit both active and passive movements and must find a compromise between rigidity and flexibility to explore the surrounding environment and navigate obstacles and supports. In particular, we want to use the tools of optimal control theory to better understand how mass is distributed along a searcher shoot. More precisely, for a given amount of mass, we want to find the best way to distribute it so as to maximize the length of the shoot without exceeding a given amount of mechanical stress. To achieve this goal, we formulate an optimal control problem for time (length) maximization subject to mixed state constraints. This optimization problem belongs to a classical category of problems on beam buckling. In particular, Keller et al. (1960) investigated the shape of the column that has the largest buckling load. From that work, further studies and methods were developed to solve the problem of the tallest column. Of particular note in this line of research are the works of Farjoun and Neu (2005) and Wei et al. (2012).
In the former work, a symmetry of the dynamical system is employed to solve the boundary value problem related to height maximization. In the latter, the same technique is used to solve the problem of the tree branch with the furthest reach; there, the solution is studied analytically only near the tip, while the overall behaviour is displayed via numerical simulations. Here the problem is slightly different, since we look for a length maximization. Moreover, we carry out a deeper study of the necessary conditions for optimality in order to characterize analytically the optimal radius of the cross-section.
This work is structured in the following way. In Sect. 2, we use the theory of elastic rods to derive the differential equation governing the mechanics of the searcher shoot. Then, we couple this equation with the boundary and stress constraints and a cost function; in this way, we state the optimal control problem of length maximization. In Sect. 3, we extend the dynamics by convexifying the velocity set, therefore allowing for a larger set of velocities. This relaxation of the problem allows us to prove the existence of an optimal solution. In Sect. 4, we reformulate the optimal control problem and find a set of necessary conditions for the optimal trajectory. Then, in Sect. 5, we study the corresponding adjoint system and obtain a feedback formula for the optimal control \(u^*\). Finally, in Sect. 6, we perform numerical simulations of the optimal trajectory and the adjoint arcs based on experimental data. In those simulations, we assume that the sample under examination is maximizing its length following the principles stated in our model. This allows us to consider the length of the sample as the maximal length \(L^*\) of the optimization problem. We then estimate the constants of the model in order to fit it into the optimization framework. In the last sections, we discuss the results of these simulations and the analytical optimal solution. Although the analysis of the problem is rather technical, the final result is a simple relation between the curvature \(\theta '\) of the stem and the radius R of the cross-section:
Such an equality means that the stem is thinner, hence more flexible, where the curvature is higher.
2 Derivation of the Model
We model the searcher shoot as an inextensible and unshearable elastic rod whose centerline is confined to a plane. Let \(\{e_1,e_2,e_3\}\) be an orthonormal basis of \({\mathbb {R}}^3\). We represent the centerline of the rod by a curve \(r \in C^2([0,L]; {\mathbb {R}}^3)\) of length L. We parametrize the curve by its arc-length parameter \(s \in [0,L]\) and assume that \(r \in \text {span}\{e_1, e_3\}\). For the rod cross-section, we consider as generalized frame (Goriely 2017) the Frenet moving frame \(\{ \nu , \beta , \tau \}\), formed, respectively, by the normal, binormal and tangent vectors. Naming \(\theta (s)\) the angle between \(\tau (s)\) and \(e_3\) (that is, the vertical line), the curvature \(\kappa (s)\) is simply
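In the planar setting, the curvature is just the rate of rotation of the tangent angle:

```latex
\kappa(s) = \theta'(s),
```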
where “\( ' \)” denotes the derivative with respect to s. Since r is confined to the plane, it is entirely described by its initial point r(0) (whose value is fixed), its initial inclination \(\theta (0)\) and its curvature \(\kappa \).
We want to account for the gravitational force acting on the rod r, \(\{ \nu , \beta , \tau \}\). To this aim, we consider another configuration \({\hat{r}}\), \(\{ {\hat{\nu }}, {\hat{\beta }}, {\hat{\tau }} \}\), confined to \(\text {span}\{e_1,e_3\}\). We refer to this latter configuration as the intrinsic configuration: it represents the shape that the rod would have in the absence of the passive elastic deflection due to its weight. We refer to the former as the current configuration. In our model, we assume that the intrinsic configuration is just a straight line starting from r(0) and directed along \(e_1\). In particular, this means that the intrinsic curvature \({\hat{\kappa }}\) is identically zero along the stem. In response to the gravitational force, the rod generates an internal force n and an internal moment (of force) m. Assuming that gravity is oriented along \(-e_3\) and that it is balanced by the rod's response, we get the following set of equations:
where g is the gravitational acceleration and \(\rho \) is the density per unit length of the rod. We refer to \(\rho \) as the linear density. To close the set of differential equations, we need a constitutive relationship between the internal moment and the difference between the intrinsic and the current curvature, \({\hat{\kappa }} \equiv 0\) and \(\kappa \), respectively. The Euler–Bernoulli law provides a classical relationship of this kind, which in the planar case takes the following form:
In this equation, E is Young's modulus, which measures the stiffness of the rod. Furthermore, I is the second moment of area of the cross-section along the direction given by the binormal vector \(\beta \). In our case, we consider a circular cross-section of radius R. So, we have
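For a circular section of radius R, the second moment of area takes the standard textbook form

```latex
I(s) = \frac{\pi R^4(s)}{4}.
```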
Considering together system (1) and Eq. (2), we obtain
where we are also assuming that there is no external load at the rod's tip, which means that \(n(L) = 0\).
2.1 Formulation of the Problem
As expressed by Eq. (1), when an elastic rod is subject to external forces, the material generates the internal moment m in response (Goodno et al. 2020). This moment causes the deflection of the rod. The internal force per unit area that generates m is called (bending) stress, and we denote it by \(\sigma \). In the case of a circular cross-section, the maximal stress developed by the rod at r(s) is
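By the standard flexure formula, the bending stress is largest at the outer fibre of the section, at distance R from the neutral axis; with \(I = \pi R^4/4\) this reads

```latex
\sigma_M(s) = \frac{|m(s)|\,R(s)}{I(s)} = \frac{4\,|m(s)|}{\pi R^3(s)}.
```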
We assume that the maximal stress \(\sigma _M\) cannot exceed a certain fixed threshold \({\bar{\sigma }}\). The mass of the shoot is represented by the linear density \(\rho \) of the main stem, which is related to the density per unit volume \(\rho _3\) and the radius R by the equation
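Since the section is a disc of radius R, the linear density is the volume density times the section area:

```latex
\rho(s) = \pi \rho_3 R^2(s).
```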
To formulate our problem, we take the volume density \(\rho _3\) and Young's modulus E to be constant along the shoot. Furthermore, we assume that the main stem does not bear any secondary branches or leaves. Then, we represent the optimal control problem of shoot length maximization with the following system:
where
The boundary conditions mean that we are fixing an initial inclination of the rod equal to \(\theta _0\) and that we are considering just the weight of the rod, without any extra element at the tip, so the intrinsic and the current curvatures coincide. The total mass of the main stem is given by
In the above expression, \(\rho _3\) is a constant; hence the constraint \(\int _0^L R^2(\sigma ) \textrm{d}\sigma = M\) means that we are fixing the total mass to the value \(\rho _3 \pi M\). The set of controls \({\mathcal {U}}\) will be specified later in the presentation.
We introduce the variables:
To avoid pathological and rather unrealistic situations, we assume an upper and a lower bound on the variable u, that is, \(u_m \le u \le u_M\). Using these new variables and conditions, we obtain from (4) the optimization problem:
![figure a](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00332-023-10011-5/MediaObjects/332_2023_10011_Figa_HTML.png)
where
-
\(\theta _0 \in [0, 2 \pi ]\) and \(c_1,c_2, u_m, u_M, M \in (0,+\infty )\);
-
\(x = (\psi , \theta , \mu ) \in W^{1,1}([0,L];{\mathbb {R}}^3)\);
-
\({\mathcal {U}} = \{ (u,L) \,: \, L \ge 0 \text { and } u:[0,L]\rightarrow {\mathbb {R}} \;\;\textrm{Lebesgue}\;\;\textrm{measurable} \}\);
-
\(f_P(x,u) = (c_1 \mu \sin \theta , -\psi /u^2, -u)\);
-
\(C_{P,0} = {\mathbb {R}} \times \{ \theta _0 \} \times \{ M \}\);
-
\(C_{P,1} = \{ 0 \} \times {\mathbb {R}} \times \{ 0 \}\);
-
\(h_{P,1}(x,u) = \max \{u_m - u,u - u_M\}\);
-
\(h_{P,2}(x,u) = |\psi | - c_2 u^{3/2}\).
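To make the data of (P) concrete, the following sketch (with placeholder constants, and a placeholder initial value \(\psi (0) = 0\) that does not enforce the terminal condition \(\psi (L) = 0\); enforcing it would require a shooting or fixed-point scheme, as in Sect. 3.2) implements the dynamics \(f_P\) and the mixed constraints \(h_{P,1}\), \(h_{P,2}\), and integrates the state by forward Euler under the constant control \(u = M/L\):

```python
import math

# Placeholder constants (not taken from the paper's data).
c1, c2 = 1.0, 1.0
u_m, u_M, M = 0.1, 10.0, 1.0

def f_P(x, u):
    """Dynamics of problem (P): x = (psi, theta, mu)."""
    psi, theta, mu = x
    return (c1 * mu * math.sin(theta), -psi / u**2, -u)

def h_P1(u):
    """Control-bound constraint: the control u is feasible iff h_P1(u) <= 0."""
    return max(u_m - u, u - u_M)

def h_P2(x, u):
    """Mixed (stress) constraint: the pair (x, u) is feasible iff h_P2 <= 0."""
    return abs(x[0]) - c2 * u**1.5

# Forward Euler integration with the constant control u = M/L.
L = 0.5
u = M / L
n = 1000
ds = L / n
x = (0.0, math.pi / 2, M)   # psi(0) is a placeholder guess; theta(0) = theta_0 = pi/2
for _ in range(n):
    assert h_P1(u) <= 0 and h_P2(x, u) <= 0   # constraints hold along the path
    x = tuple(xi + ds * vi for xi, vi in zip(x, f_P(x, u)))
# With mu' = -u constant, mu(L) = M - u * L = 0: the whole mass budget is spent.
```

Since \(\mu ' = -u\) with constant u, the mass coordinate \(\mu \) reaches zero exactly at \(s = L\), which is how the isoperimetric constraint \(\int _0^L R^2 \,\textrm{d}s = M\) enters the dynamics.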
We will see that the upper bound \(u_M\) for u is a fundamental assumption to prove the existence of an optimal solution in the convexified case. In general, the imposition of an upper bound on the set of controls may influence the solution to the optimization problem itself. As we will see, if \(u_M\) is large enough, this is not the case.
2.2 Notation for Multifunctions
In the following, we denote with
a multifunction from \( X \subset {\mathbb {R}}^n\) to \({\mathbb {R}}^m\), that is, a function whose domain is X and such that \(F(x) \subset {\mathbb {R}}^m\) for every \(x \in X\). The multifunction F is said to be Borel measurable if, for any open set \(A \subset {\mathbb {R}}^m\), the preimage \(F^{-1}(A) = \{ x \in X \,: \, F(x) \cap A \ne \emptyset \}\)
is a Borel subset of \(X \subset {\mathbb {R}}^n\). Moreover, we say that the multifunction F is closed, convex or nonempty if for any \(x \in X\) the set \(F(x) \subset {\mathbb {R}}^m\) is, respectively, closed, convex or nonempty. We say that F is uniformly bounded if there exists a constant \(\alpha > 0\) such that
for any \(x \in X\). With \({\mathbb {B}}^m\) we denote the closed unitary ball centred at the origin of \({\mathbb {R}}^m\). The graph of F is the set
3 Existence of Optimal Solutions
In this section, we construct a “minimal” modification of problem (P) in order to obtain an optimal control problem for which the existence of an optimal solution is guaranteed. To this aim, we first construct such an enlarged optimal control problem in Sect. 3.1 and then show that the latter has a feasible solution in Sect. 3.2.
3.1 Relaxation
To start with, we observe that the dynamics of the optimal control problem (P) can be reformulated in terms of a differential inclusion by defining
with
where
It is well known that a key assumption for the existence of a solution of an optimal control problem is the convexity of the set of admissible velocities F (see e.g. Vinter et al. (2010)). However, it is easy to see that in our case such a standard existence hypothesis is not verified. A standard procedure to overcome this issue, known as relaxation, is to enlarge the set of admissible trajectories in such a way that the existence of a maximum is guaranteed. To this aim, we consider the convexified version of problem (P):
![figure b](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00332-023-10011-5/MediaObjects/332_2023_10011_Figb_HTML.png)
Here we are using the notation:
-
\(x = (\psi ,\theta ,\mu ) \in W^{1,1}([0,L];{\mathbb {R}}^3)\);
-
\(F_d(x) = {\overline{co}}\left\{ F(x) \right\} \);
-
\(C_{d,0} = {\mathbb {R}} \times \{ \theta _0 \} \times \{ M \} \)
-
\(C_{d,1} = \{ 0 \} \times {\mathbb {R}} \times \{ 0 \}\)
We say that \((x,L) \in W^{1,1}([0,L];{\mathbb {R}}^3) \times [0,+\infty )\) is a trajectory for (\(p_d\)) if it satisfies the differential inclusion, that is, \(x'(s) \in F_{d}(x(s))\) for a.e. \(s \in [0,L]\). A trajectory for (\(p_d\)) is admissible if it also satisfies the constraint \((x(0),x(L)) \in C_{d,0} \times C_{d,1}\). Analogously, we say that \((x,u,L) \in W^{1,1}([0,L];{\mathbb {R}}^3) \times {\mathcal {U}}\) is a trajectory for (P) if \(x' = f_P(x,u)\) a.e. in [0, L], and it is admissible if all the constraints are satisfied. By construction, we observe that if (x, u, L) is an admissible trajectory for (P), then (x, L) is an admissible trajectory for (\(p_d\)).
Analogous terminology is used for problem (\(p_c\)) in Sect. 4.1.
Proposition 1
The multifunction \(F_d\) has the following characterisation:
where
Proof
To prove Proposition 1, first we make the following consideration. Fix a point \(x = (\psi , \theta ,\mu ) \in {\mathbb {R}}^3\) and define
Then
Hence, we can focus on the set
We want to prove that
Let us use g(u) to denote the function
and let \(\ell : [u_0,u_M] \rightarrow {\mathbb {R}}\) denote the straight line that joins the point \((u_0,g(u_0))\) to the point \((u_M,g(u_M))\), that is
Define
where
and
Since g is a convex and continuous function and \(hyp(\ell )\) is a convex set, C is a convex closed set containing \(\left\{ \left( \frac{\psi }{u^2}, u \right) \,: \, u \in [u_0,u_M]\right\} \). In particular, one has that \(A \subseteq C\).
Step 1: \(C \subseteq B\).
Consider \(({\bar{u}},{\bar{v}}) \in C\). Then one has that
Take now the straight line \(\ell _0:[u_0,u_M] \rightarrow {\mathbb {R}}\) with the same slope of \(\ell \) and such that \(\ell _0({\bar{u}}) = {\bar{v}}\), that is
By construction, one has that
Hence, by using the continuity of \(\ell _0\), there exist \(u_1\) and \(u_2\) satisfying \(u_1 \le {\bar{u}} \le u_2\) and such that
which means that the point \(({\bar{u}},{\bar{v}})\) is a convex combination of the points
This shows that \(C\subseteq B\).
Step 2: \(A = B\).
Since it follows from Step 1 that \(C \subseteq B\), one also has \(A \subseteq B\). On the other hand, it follows from the definition of B that also the inclusion \(B \subseteq A\) holds. Hence one has that \(A = B\). This completes the proof. \(\square \)
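The geometric content of the proof can be checked numerically. In the sketch below (with illustrative values, the coordinates written as \((u, v)\) rather than \((v, u)\), and \(\psi > 0\) assumed so that \(g(u) = \psi /u^2\) is convex), every convex combination of two points on the graph of g lies between the graph and the chord over the control interval, which is exactly the sandwich used in Steps 1 and 2:

```python
import random

# Illustrative values; psi > 0 is assumed here so that g is convex.
psi, u_m, u_M = 2.0, 0.5, 3.0

def g(u):
    return psi / u**2

def chord(u):
    """Affine function joining (u_m, g(u_m)) and (u_M, g(u_M))."""
    t = (u - u_m) / (u_M - u_m)
    return (1 - t) * g(u_m) + t * g(u_M)

random.seed(0)
for _ in range(10_000):
    u1 = random.uniform(u_m, u_M)
    u2 = random.uniform(u_m, u_M)
    lam = random.random()
    u_bar = lam * u1 + (1 - lam) * u2
    v_bar = lam * g(u1) + (1 - lam) * g(u2)
    # convexity of g gives the lower bound, affinity of the chord the upper one
    assert g(u_bar) - 1e-9 <= v_bar <= chord(u_bar) + 1e-9
```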
3.2 Existence of Relaxed Optimal Solutions
In this section, we will show the existence of an optimal solution for (\(p_d\)). To achieve this result, we begin by proving the existence of at least one admissible trajectory.
Proposition 2
Fix the constants
and choose
![figure c](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00332-023-10011-5/MediaObjects/332_2023_10011_Figc_HTML.png)
Then there exists an admissible trajectory (x, u, L) for (P).
Consequently, problem (\(p_d\)) also has at least one admissible trajectory.
Proof
To prove the statement, we make use of a fixed point argument. Let us fix the constant control \(u = M/L\); for \(L \le M/u_m\) we have \(u \ge u_m\), which implies that the lower bound given by \(h_{P,1}\) is satisfied. Now, let us consider \(\mu (t) = (L - t)M/L\). It is easy to observe that the trajectory \(((\psi , \theta + \theta _0, \mu ), M/L, L)\) is admissible for problem (P) if and only if \((\psi ,\theta , \mu )\) solves the system
for all \(t\in [0,L]\). We define \(X = \{ \psi ,\theta \in C([0,L]): \, \psi (L) = \theta (0) = 0 \}\). Then \((X,|| \cdot ||_{\infty })\) is a Banach space. Consider the function \(F: X \rightarrow X\)
To prove the existence of a solution to system (5) we just need to prove that for L small enough, F is a contraction and the inequality for \(|\psi |\) is satisfied.
Let \((\psi ,\theta ), ({\bar{\psi }},{\bar{\theta }}) \in X\).
This means that for \(L < \min \left\{ \frac{1}{c_1 M}, M^{2/3} \right\} \), F is a contraction. Moreover, we notice that \(||\psi '||_{\infty } \le c_1 M\). The bound on the derivative and the terminal condition imply that \(||\psi ||_{\infty } \le c_1 L M\). Then, the inequality on \(\psi \) is always satisfied if
which gives the upper bound \(L \le \left( \frac{c_2}{c_1} \sqrt{M} \right) ^{2/5}\).
Collecting all the upper bounds for L, we can define
and we can take \(u = M/{\bar{L}}\). So, if condition (\({HP_{max}}\)) is satisfied, then \(u \le u_M\). That is, the trajectory \(((\psi ,\theta + \theta _0,\mu ), u, {\bar{L}})\) is admissible for (P) and consequently \(((\psi ,\theta + \theta _0,\mu ), {\bar{L}})\) is an admissible solution for (\(p_d\)). \(\square \)
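The contraction argument above can be mirrored numerically. The sketch below uses placeholder constants and assumes the natural integral form of system (5), namely \(\psi (t) = -\int _t^L c_1 \mu \sin (\theta + \theta _0)\,\textrm{d}\tau \) and \(\theta (t) = -\int _0^t \psi /u^2\,\textrm{d}\tau \) (the rewriting of the dynamics with \(\psi (L) = 0\), \(\theta (0) = 0\)); it runs the Picard iteration for the map F and observes the fast convergence predicted for small L:

```python
import math

# Placeholder constants; mu(t) = (L - t) * M / L and the constant control
# u = M / L follow the feasibility argument of Proposition 2.
c1, M, theta0 = 1.0, 1.0, math.pi / 2
L = 0.2                        # small enough for the map to be a contraction
u = M / L
n = 400
h = L / n
grid = [i * h for i in range(n + 1)]
mu = [(L - t) * M / L for t in grid]

def iterate(psi, theta):
    """One application of the integral operator associated with system (5)."""
    rhs = [c1 * mu[i] * math.sin(theta[i] + theta0) for i in range(n + 1)]
    psi_new = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):       # backward trapezoid rule, psi(L) = 0
        psi_new[i] = psi_new[i + 1] - h * 0.5 * (rhs[i] + rhs[i + 1])
    theta_new = [0.0] * (n + 1)
    for i in range(1, n + 1):            # forward trapezoid rule, theta(0) = 0
        theta_new[i] = theta_new[i - 1] - h * 0.5 * (psi[i - 1] + psi[i]) / u**2
    return psi_new, theta_new

psi, theta = [0.0] * (n + 1), [0.0] * (n + 1)
for _ in range(50):
    psi_new, theta_new = iterate(psi, theta)
    gap = max(max(abs(a - b) for a, b in zip(psi, psi_new)),
              max(abs(a - b) for a, b in zip(theta, theta_new)))
    psi, theta = psi_new, theta_new
    if gap < 1e-12:                      # sup-norm convergence of the iteration
        break
```

The computed fixed point has \(\psi < 0\) on [0, L) and \(\theta \) increasing, consistent with Proposition 4.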
To prove the existence of a maximizing trajectory we need a bound on the initial condition of the optimal trajectory. This property follows from the limited mass M at our disposal and the boundedness of the dynamics.
Proposition 3
The set of admissible trajectories for (\(p_d\)) does not change if we replace \(C_{d,0}\) defined in (\(p_d\)) with
Moreover, for any admissible trajectory \((x,L) \in W^{1,1}([0,L];{\mathbb {R}}^3) \times [0, +\infty )\) we have
for every \(s \in [0,L]\).
Proof
Let \(x = (\psi ,\theta ,\mu )\) be an admissible trajectory for (\(p_d\)). Since \(\mu (0) = M > 0 = \mu (L)\), we must have \(L > 0\). Moreover, one has that
which implies the bound
The bound on \(\mu \) follows immediately from the dynamics. Indeed, by definition of \(F_d\), one has \(\mu ' \le 0\). Hence
For what concerns the variable \(\psi \), we have
which gives the bound for \(\psi \) in [0, L].
Finally, by taking into account the equation for \(\theta '\), we observe that
Hence, all the bounds of the thesis are verified. \(\square \)
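A sketch of how the stated bounds can be derived from the dynamics (the estimates below are consistent with the value \(\psi _{\max } = c_1 M^2/u_m\) used in Fig. 1):

```latex
M = \mu(0) - \mu(L) = \int_0^L (\lambda_1 u_1 + \lambda_2 u_2)\,\mathrm{d}s \ \ge\ u_m L
\;\Longrightarrow\; L \le \frac{M}{u_m}, \qquad
0 \le \mu(s) \le M, \qquad
|\psi(s)| \le c_1 M (L - s) \le \frac{c_1 M^2}{u_m}, \qquad
|\theta'(s)| \le \frac{|\psi(s)|}{u_m^2} \le \frac{c_1 M^2}{u_m^3}.
```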
To prove the existence of a maximizer we employ a compactness theorem stated in Proposition 2.5.3 in Vinter et al. (2010), and the bound on the admissible lengths L stated in Proposition 3.
Theorem 1
Assume (\({HP_{max}}\)) holds. Then, problem (\(p_d\)) admits a maximizer.
Proof
Define the closed and bounded set \(X \subset {\mathbb {R}}^3\),
In view of Proposition 3, for any admissible trajectory \((x,L) \in W^{1,1}([0,L];{\mathbb {R}}^3) \times [0,+\infty )\), one has that \((\psi , \theta , \mu )(t)\in X\) for all \(t\in [0,L]\).
Step 1: \(F_d: {\mathbb {R}}^3 \rightrightarrows {\mathbb {R}}^3\) is a closed, convex, nonempty, Borel measurable multifunction.
\(F_d\) is closed, convex and nonempty by definition. Concerning the Borel measurability, define
Take any open set \(A \subset {\mathbb {R}}^3\). By taking into account Proposition 1 and the continuity of \(f_{u_1,u_2,\lambda }(x)\) with respect to \(u_1,u_2,\lambda \) and x, we have
Hence, \(F^{-1}(A)\) is a countable union of Borel sets, that is a Borel set itself.
Step 2: \(F_d\) has a closed graph. Let \((x_n)_n\) and \((v_n)_n\) be two sequences in \({\mathbb {R}}^3\) such that for each n,
and \(x_n \rightarrow x\), \(v_n \rightarrow v\) for some \(x,v \in {\mathbb {R}}^3\). We want to prove that \(v \in F_{d}(x)\).
It follows from Proposition 1 that, for each n, there exist \(u_{1,n}, u_{2,n}, \lambda _n\) such that
Since \(u_{1,n},u_{2,n} \in [u_m,u_M]\) and \(\lambda _n \in [0,1]\) for each n, then there exist \(u_1,u_2 \in [u_m,u_M]\) and \(\lambda \in [0,1]\) such that \(u_{1,n} \rightarrow u_1\), \(u_{2,n} \rightarrow u_2\) and \(\lambda _n \rightarrow \lambda \) at least along a subsequence. Thus, by continuity, we have
that is exactly \(v \in F_{d}(x)\).
Step 3: there exists \(\alpha > 0\) such that \(F_{d}(x) \subset \alpha {\mathbb {B}}^3\) for any \(x \in X\). It follows from the definition of X and the boundedness of the control set that, for any \(x \in X\), we have
where we recall that \({\mathbb {B}}^3 \subset {\mathbb {R}}^3\) is the closed unitary ball centred at the origin.
Step 4: Passing to the limit from a maximizing sequence.
Consider a maximizing sequence of admissible trajectories \((x_n,L_n)_n\) for problem (\(p_d\)). In view of Proposition 2, such a sequence exists. It follows from Proposition 3 that the end-time sequence \((L_n)_n\) is bounded. So, along a subsequence (we do not relabel), there exists \(L > 0\) such that \(L_n \rightarrow L\). Furthermore, it is not restrictive to assume that \(L_n \le L\) for all n sufficiently large along a subsequence (the case \(L_n \ge L\) can be treated by using similar arguments). Hence, we can extend \(x_n\) to the whole [0, L] by defining
so that \(y_n \in W^{1,1}([0,L];{\mathbb {R}}^3)\) with \(y'_n = x'_n\) in \([0,L_n]\) and \(y'_n \equiv 0\) in \([L_n,L]\). It follows from Proposition 3 that \(y_n \in X\) for every n. Since \(F_d\) restricted to X is bounded, \(y'_n \in F(y_n)\) a.e. in \([0,L_n]\) and \(y'_n \equiv 0\) in \([L_n,L]\) for every n, the sequence \((y'_n)_n\) is uniformly essentially bounded. Moreover, by invoking again Proposition 3, one has that \((y_n(0))_n\) is a bounded sequence.
Let us define \(A_n = [0,L_n]\) and observe that \({\mathcal {L}}(A_n) = L_n \rightarrow L\) as \(n \rightarrow +\infty \). It follows from Proposition 2.5.3 of Vinter et al. (2010) that there exists a function \(x \in W^{1,1}([0,L];{\mathbb {R}}^3)\) such that \(x' \in F(x)\) a.e. in [0, L] and \((x(0),x(L)) \in C_{d,0} \times C_{d,1}\), that is, (x, L) is an admissible trajectory for (\(p_d\)). Since L is the limit of the lengths of a maximizing sequence, it is the maximal value for problem (\(p_d\)), and hence (x, L) is a maximizer for (\(p_d\)). This concludes the proof. \(\square \)
4 Necessary Conditions
4.1 Reformulation
Proposition 1 characterizes the velocity set \(F_d\). Using such a characterization, we recast Problem (\(p_d\)) into the following system:
![figure d](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00332-023-10011-5/MediaObjects/332_2023_10011_Figd_HTML.png)
with
-
\(x = (\psi , \theta , \mu ) \in W^{1,1}([0,L];{\mathbb {R}}^3)\);
-
\(u = (u_1,u_2,\lambda )\);
-
\({\mathcal {V}} = \{(u, L):\, \, u:[0,L]\rightarrow {\mathbb {R}}\times {\mathbb {R}}\times [0,1] \;\;\textrm{Lebesgue}\; \textrm{measurable}, \;\) \(\text { }L \in [0,+\infty )\}\);
-
\(f_c(x) = \left( c_1 \mu \sin (\theta ),- \psi \left( \frac{\lambda _1}{u_1^2} + \frac{\lambda _2}{u_2^2} \right) ,- (\lambda _1 u_1 + \lambda _2 u_2) \right) \) with \(\lambda _1 = \lambda \) and \(\lambda _2 = 1 - \lambda _1\);
-
\(h_c(x,u) = \max \left\{ |\psi | - c_2 u_i^{3/2}, u_m - u_i, u_i - u_M, - \lambda , \lambda - 1 \right\} ,\)
where the use of the subscript i means that the maximum is taken over both \(i = 1,2\).
In view of Proposition 9 (see Appendix), problem (\(p_c\)) is equivalent to problem (\(p_d\)).
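To make the reformulation concrete, the sketch below (with placeholder constants) implements the relaxed dynamics \(f_c\) and the mixed constraint \(h_c\); at \(\lambda = 1\) the relaxed dynamics collapses to the original \(f_P\) with control \(u_1\):

```python
import math

# Placeholder constants for problem (p_c); not taken from the paper's data.
c1, c2, u_m, u_M = 1.0, 1.0, 0.5, 3.0

def f_c(x, u):
    """Relaxed dynamics: x = (psi, theta, mu), u = (u1, u2, lam)."""
    psi, theta, mu = x
    u1, u2, lam = u
    lam1, lam2 = lam, 1.0 - lam
    return (c1 * mu * math.sin(theta),
            -psi * (lam1 / u1**2 + lam2 / u2**2),
            -(lam1 * u1 + lam2 * u2))

def h_c(x, u):
    """Mixed constraint of (p_c); the pair (x, u) is feasible iff h_c <= 0."""
    psi = x[0]
    u1, u2, lam = u
    terms = [-lam, lam - 1.0]
    for ui in (u1, u2):
        terms += [abs(psi) - c2 * ui**1.5, u_m - ui, ui - u_M]
    return max(terms)

x = (-0.2, 1.0, 0.8)
v = f_c(x, (1.0, 2.0, 1.0))     # lam = 1: only the control u1 = 1.0 acts
feasible = h_c(x, (1.0, 2.0, 0.5)) <= 0   # all constraint terms are negative
```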
Proposition 4
Let (x, u, L) be an admissible trajectory for problem (\(p_c\)) such that \(\theta (0) = \pi /2\). Then, if one of the following two conditions holds true
![figure e](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00332-023-10011-5/MediaObjects/332_2023_10011_Fige_HTML.png)
we have:
-
\(\theta \) is increasing and \(\theta (s) \in [\pi /2,\pi )\) for all \(s \in [0,L]\);
-
\(\psi \) is increasing and \(\psi (s) < 0 \) for all s in [0, L).
Proof
Assume \(u_1 \le u_2\). From the dynamics \(f_c\) we observe that
Furthermore, from the state constraint, we have that
Consequently it follows from (6), (7) and from Proposition 3 that
So, if \(c_2 < \frac{\pi \sqrt{u_m}}{2L}\), for every \(s \in [0,L]\)
and we get that \(\theta \in (0,\pi )\). On the other hand, if \(c_1 < \frac{\pi }{2} \left( \frac{u_m}{M} \right) ^3\), by Proposition 3 we obtain the same conclusion.
The bound \(\theta \in (0,\pi )\) determines the signs of \(\psi '\) and \(\psi \):
Using the above relations, we refine the bound on \(\theta \) and get the monotonicity property:
which means that \(\theta \) is increasing and \(\theta \in (\pi /2, \pi )\), concluding the proof. \(\square \)
Remark 1
In Proposition 4 the condition for the constant \(c_2\) depends on the length L of the admissible trajectory. However, using the upper bound on L given in Proposition 3, it is possible to obtain
so that if \(c_2\) satisfies this condition, then (\(HP_2c\)) holds for any admissible trajectory.
4.2 Notations for Basic Non-smooth Analysis
The mixed constraint \(h_c\) leads to a formulation of Pontryagin's maximum principle that involves only absolutely continuous adjoint trajectories. This version of Pontryagin's maximum principle can be found in Theorem 2.1 in Clarke et al. (2010). Before discussing the necessary conditions for the optimization problem (\(p_c\)), we fix some essential notation of non-smooth analysis.
Given a non-empty closed set \(S \subset {\mathbb {R}}^n\), we define the proximal normal cone to S at a point \(x \in S\) as
Any element of the proximal normal cone is called proximal normal vector. We define the limiting normal cone (also known as Mordukhovich’s normal cone) as
and we also define the generalized normal cone (also known as Clarke’s normal cone) as
It is clear from the definitions that
In an analogous way, we define proximal, limiting and generalized subgradients for a lower semicontinuous function \(f: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\):
and
There is a well-known relation between the level sets of a function and its subgradient, which is stated in the following theorem (see Theorem 11.38 in Clarke et al. 2013).
Theorem 2
Let f be a locally Lipschitz function and define
Fix \(x \in S \) such that \(f(x) = 0\). If \(0 \notin \partial ^L f(x)\), then
The proof of this statement can be found in Theorem 11.38 in Clarke et al. (2013). Another useful result is the so-called max rule (see for instance Theorem 5.5.2 in Vinter et al. 2010).
Theorem 3
Let \(f_i: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\), \(i = 1,...,m\), be a collection of m locally Lipschitz continuous functions. Define \(f: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) as
and
Then
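In its usual form, the conclusion of the max rule states that the limiting subgradient of the max is contained in the convex combinations of the subgradients of the active functions:

```latex
\partial^L f(x) \;\subseteq\; \Big\{ \sum_{i=1}^m \lambda_i \xi_i \;:\; \xi_i \in \partial^L f_i(x),\ \lambda_i \ge 0,\ \sum_{i=1}^m \lambda_i = 1,\ \lambda_i = 0 \text{ if } f_i(x) < f(x) \Big\}.
```

For instance, for \(f(x) = \max \{x, -x\} = |x|\) at \(x = 0\) both functions are active, and the rule gives \(\partial ^L f(0) \subseteq [-1,1]\).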
4.3 A Pontryagin’s Maximum Principle
4.3.1 The Bounded Slope Condition
To derive the necessary conditions for (\(p_d\)), it is important to verify the Lipschitz continuity and the boundedness of \(F_d\) (see for instance section 2.3 in Clarke et al. (2005)). These properties are implied by the bounded slope condition, which requires a relation between the partial derivatives of \(F_d\). A single-valued function is easier to manage than a multifunction, so we look at this condition for problem (\(p_c\)). A single-valued formulation of the bounded slope condition can be found in condition \(\varvec{BS_*^{\varepsilon ,R}}\) of Clarke et al. (2010), which we summarize here.
Fig. 1: Constraint set given by the conditions \(u \in [u_m,u_M]\) and \(|\psi | \le c_2 |u|^{3/2}\). The black arrows represent the normal vectors to the edges of the sets; at the corners A, B, C, D, since we have a discontinuity, those normal vectors become normal cones. The value \(u_{\max }\) is the one given by condition (\({HP_{\max }}\)), while \(\psi _{\max }\) is the corresponding maximum value of \(\psi \), that is, \(\psi _{\max } = (c_1 M^2)/u_m\). The bounded slope condition translates into the following geometrical requirement: there must be no vertically oriented arrows near the optimal trajectory. As shown by the orange arrows in the figure, corners B and C are clearly a threat to this condition. However, since for any admissible trajectory we have \(|\psi | \le \psi _{\max }\), the optimal trajectory (and, more generally, any admissible trajectory) never reaches the green area of the figure
Assume we are considering an optimisation problem whose admissible trajectories (x, u) (where u is the control) satisfy the constraint \((t,x(t),u(t)) \in S\), where S is a closed set. Let \((x^*,u^*, L^*)\) be an optimal process and \(R(t) > 0\) an arbitrary positive measurable function. Define a tube around the optimal process \((x^*,u^*, L^*)\) in S
for a.e. \(t\in [0,L^*]\). The bounded slope condition requires the existence of a measurable real-valued function \(k_S\) such that
In our case, \(S = [0,L^*] \times \{(x,u) \,: \, h_c(x,u) \le 0 \}\) and in the following proposition, we verify the bounded slope condition for a fixed R. An intuitive idea of why in our case this condition holds is given in Fig. 1.
Proposition 5
Let
If
![figure f](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00332-023-10011-5/MediaObjects/332_2023_10011_Figf_HTML.png)
then there exist \(\varepsilon _c, K_c \in [0,+\infty )\) such that for any admissible trajectory (x, u, L) for problem (\(p_c\)) and for any \(s \in [0,L]\)
Notice that we prove a stronger condition than the bounded slope condition, since we show such a property in a neighbourhood of any admissible process (x, u, L) and not just of the optimal process \((x^*,u^*,L^*)\).
Proof
Consider the functions
Each of the above functions is smooth in E and by definition
In the notation of Theorem 3,
and by its application, we have
We can compute explicitly the gradients of the functions:
We want to see that for any \(g = (g_1,..., g_6) \in G(x,u)\), we have \((g_4,g_5,g_6) \ne (0,0,0)\). The case \((g_4,g_5,g_6) = (0,0,0)\) can happen only in the following situations:
1. \(\tilde{\lambda } \nabla b_i + \tilde{\lambda } \nabla c_i = 0 \) for some \(\tilde{\lambda } > 0\);
2. \(\tilde{\lambda } \nabla d + \tilde{\lambda } \nabla e = 0\) for some \(\tilde{\lambda } > 0\);
3. \(\lambda _{a,i} \nabla a_i + \lambda _{c,i} \nabla c_i = 0\) for some \(\lambda _{a,i}, \lambda _{c,i} > 0\).
Cases 1 and 2 cannot happen. Indeed, assume for instance that case 1 holds true. Since \(\tilde{\lambda } > 0\), this is possible only if \(b_i(x,u) = c_i(x,u) = 0\), consequently \(u_i = u_m = u_M\) and this is not possible. Similar reasoning can be applied to case 2.
Now, we want to show that for (x, u) sufficiently close to (x(s), u(s)) for any \(s \in [0,L]\), case 3 never occurs. Indeed, to have \(\lambda _{a,i}, \lambda _{c,i} > 0\), we need
By Proposition 3, we know that
So, if \(u_M\) satisfies hypothesis (\(HP_{max}^1\)), then (x(s), u(s)) cannot satisfy the conditions in (9), because
So, if we take a constant \(\varepsilon _c > 0\) such that
then, for any \(s \in [0,L]\) and for any \((x,u) \in E\) such that
we have
Consequently, system (9) cannot be satisfied.
Thus, for any \((\alpha ,\beta ) \in G(x,u)\) with (x, u) satisfying condition (10), where \(\alpha \) considers the partial derivatives with respect to the x and \(\beta \) the partial derivatives with respect the control u, we have
So, by choosing
we always have \(|\alpha | \le K_c |\beta |\). We also observe that in this situation, we always have \(|(\alpha ,\beta )| \ne 0\). For this reason, we can apply theorem 2 and consequently
which concludes the proof. \(\square \)
Remark 2
The set G(x, u) defined in (8) is convex and closed. This means that also the set
is convex and closed. Consequently, we also have the inclusion
where \(E = \{(x,u) \,: \, h_c(x,u) \le 0 \}\).
For any fixed \(\xi \in N_E^C(x,u)\), we use \(\xi _{\psi }\) to denote the component of \(\xi \) corresponding to \(\psi \). The notations \(\xi _{u_i}\) and \(\xi _{\lambda }\) have a similar meaning. Furthermore, in view of the proof of Proposition 5, we can observe that
4.3.2 The Adjoint System
Given \({\bar{L}} \ge L > 0 \) and a measurable function \(x: [0,L] \rightarrow {\mathbb {R}}^n\), we define
We say that an admissible trajectory \((x^*,u^*,L^*)\) is an \(\varepsilon \)-local maximum for (\(p_c\)) if for any other admissible trajectory (x, u, L) such that
for a.e. \(s \in [0,{\bar{L}}]\), where \({\bar{L}} = \max \{L^*,L\}\), the inequality \(L^* \ge L\) holds.
Theorem 4
Assume (\(HP_{max}^1\)) is satisfied and let \(\varepsilon _c\) be the constant of Proposition 5. Let \((x^*,u^*,L^*)\) be an \(\varepsilon _c\)-local maximum for (\(p_c\)) and define the set of constraints
Then, there exists an arc \(p = (p_{\psi },p_{\theta },p_{\mu }) \in W^{1,1}([0,L^*];{\mathbb {R}}^3)\) and a number \(\lambda _0 \in \;\{0,1\}\) satisfying the non-triviality condition
such that
and
almost everywhere in \([0,L^*]\), where \(\xi : [0,L^*] \rightarrow {\mathbb {R}}^7\) is a measurable function such that \(\xi (s) \in N_E^C(x^*(s),u^*(s))\) for a.e. \(s \in [0,L^*]\). Let
be the unmaximized Hamiltonian. Then, the Weierstrass condition holds. So, for almost every \(s \in [0,L^*]\) and for every \(u \in {\mathbb {R}}^3\) such that \((x^*(s),u) \in E\) and \(|u - u^*(s)| \le \varepsilon _c\), we have
Notice that the non-triviality condition holds for every \(s \in [0,L^*]\).
Proof
We reformulate the free end-time problem (\(p_c\)) into a fixed end-time maximization by a transformation of the independent variable. So, we consider the following system
where \(\tau \) is now a further control function. It is clear that if \((x^*,u^*,L^*)\) is an \(\varepsilon _c\)-local maximum for (\(p_c\)), then \((x^*,(u^*,\tau ^*\equiv 1))\) is an \(\varepsilon _c\)-local minimum of (14). The conclusion follows from an application of Pontryagin's maximum principle (see Theorem 2.1 of Clarke and De Pinho (2010)). Indeed, in view of Proposition 5, we know that the bounded slope condition \(\varvec{BS_*^{\varepsilon ,R}}\) holds true for (14). It remains to verify the Lipschitz continuity of the function
for \((\tilde{x},\tilde{u})\) in the set
for a.e. \(s \in [0,L^*]\). Thanks to Proposition 3, we know that \(|\psi ^*|\) and \(\mu ^*\) are bounded. This means that \(T_c(s)\) is a compact set. Moreover, \((x,0) \notin T_c(s)\) for any s. Consequently, \(f_L\) is smooth in \(T_c(s)\) and the Lipschitz continuity follows from the compactness of \(T_c(s)\). \(\square \)
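The change of variable used at the beginning of the proof is the standard reduction of a free end-time problem to a fixed interval; it can be sketched as follows (the displayed system is our reconstruction from the statement \(\tau ^* \equiv 1\), not the paper's exact Eq. (14)):

```latex
% Free end-time dynamics x'(s) = f(x(s), u(s)) on [0, L], recast on the
% fixed interval [0, L^*] by adding the time-scaling control tau:
\begin{aligned}
  x'(s) &= \tau (s)\, f\bigl(x(s), u(s)\bigr), \qquad s \in [0, L^*],\\
  \text{length to maximise:}\quad & \int_0^{L^*} \tau (s)\, \mathrm{d}s,
  \qquad \tau (s) \in (0,+\infty ).
\end{aligned}
```

Controls \(\tau \) close to 1 represent end-times L close to \(L^*\), and \(\tau ^* \equiv 1\) recovers the original optimal process.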
5 Properties of the Adjoint Arc
In this section, we analyse the behaviour of the adjoint arc p in Theorem 4, in order to gain more information on the optimal control. From now on, we assume that the initial inclination of the elastic rod is
Remark 3
In the notation of Theorems 3 and 4, for any \(s \in [0,L^*]\) such that \(h_{c}(x^*(s),u^*(s)) = 0\), given \(\xi (s) \in N_{E}^C(x^*(s),u^*(s))\), we have
with \(\tilde{\lambda } \ge 0\) and \((\lambda _{a,1},..., \lambda _{e}) \in \Lambda \). Recalling Remark 2, it follows from Proposition 4 that
Moreover, recalling the proof of Proposition 5, if (\(HP_{max}^1\)) holds and \(u_i = u_M\), then \(\xi _{u_i} \ge 0\).
Proposition 6
Let \((x^*,u^*,L^*)\) be an optimal trajectory for problem (\(p_c\)). Let p be an adjoint arc which satisfies system (11) and assume that hypotheses (\(HP_{max}^1\)) and (\(HP_2c\)) hold. Then there exists \({\bar{s}} \in [0,L^*]\) such that if \({\bar{s}} < L^*\), then
Otherwise, if \({\bar{s}} = L^*\), only the condition on \(p_{\theta }\) holds, that is, \(p_{\theta }(s) > 0\) for \(s \in [0,L^*)\).
Proof
Let us define the functions
Since \((x^*,u^*)\) is fixed, a and b can be regarded as the time-dependent functions appearing in (11). Furthermore, in view of hypothesis (\(HP_2c\)), it follows from Proposition 4 that \(a,b > 0\) in \([0,L^*)\). Then the adjoint system (11) can be written as
where \(\xi (s) \in N_{E}^C(x^*(s),u^*(s))\).
As a first step in the proof of Proposition 6, we will prove the following claim.
Claim: there does not exist \(s_0 \in [0,L^*)\) such that
Let us define
It follows from the assumption (16) that \(s_1 > s_0\). If \(s_1 \le L^*\), then \(p_{\theta }(s) < 0\) for \(s \in [s_0, s_1)\) and \(p_{\theta }(s_1) = 0\). Hence, it follows from Proposition 4, the condition (16) and Remark 3 that
and, by continuity of \(p_{\psi }(\cdot )\), there exists \(\delta > 0\) such that
However, by appealing again to Proposition 4, condition (16) and (17), one has that
which contradicts the relation \(p_{\theta }(s_1) = 0\). Hence the condition \(s_1\le L^*\) is not satisfied, implying that \(s_1 = +\infty \). But even this situation cannot occur since in particular it implies \(p_{\theta }(L^*) < 0\), contradicting the boundary condition \(p_{\theta }(L^*) = 0\). Hence, the condition (16) cannot occur and this proves the claim.
The main implication of the claim is to rule out the initial condition \(p_{\theta }(0) < 0\). Indeed, this follows immediately from an application of the claim combined with the results of Proposition 4. It then remains to study the following situations:
Case 1: \(p_{\theta }(0) = 0\).
In this case, we will show that one can only have \(p_{\psi } \equiv p_{\theta } \equiv 0\). To achieve this goal, we consider the linear part of (14):
Let M be the fundamental matrix solution to (15), with \(M(0) = \text {Id}\). Then the entries \(M_{i,j}\) are non-negative and increasing in \([0,L^*]\). Since \(\xi _{\psi } \le 0\) by Remark 3, and since \(a,b > 0\) in \([0,L^*)\), by using the Duhamel formula for Equation (14) we obtain \(p_{\psi }, p_{\theta } \le 0\) in \([0,L^*]\). Then, by the dynamics (14) and the claim, the only possibility is to have \(p_{\psi } \equiv p_{\theta } \equiv 0\).
Case 2: \(p_{\theta }(0) > 0\).
Let us set
Then one has that \({\bar{s}} > 0\) and \(p_{\theta }({\bar{s}}) = 0\). If \({\bar{s}} = L^*\), then \(p_{\theta }(s)>0\) for all \(s\in [0,L^*)\) and we have nothing to prove. If \({\bar{s}} < L^*\), by arguing again as in the proof of the claim, one can show that it is not possible to have \(p_{\psi }({\bar{s}}) > 0\), because this would imply \(p_{\theta }({\bar{s}}) > 0\), contradicting the definition of \({\bar{s}}\). Furthermore, one cannot obtain \(p_{\psi }({\bar{s}}) < 0\). Indeed, if \(p_{\psi }({\bar{s}}) < 0\), then by continuity of the adjoint arc \(p_{\psi }(\cdot )\), there exists \(\delta > 0\) such that \(p_{\psi }(s) < 0\) for any \(s \in [{\bar{s}} - \delta , {\bar{s}} + \delta ]\). It then follows that
However, the latter inequality contradicts the statement of the claim, providing a contradiction. Hence, the only possibility is that \(p_{\psi }({\bar{s}}) = 0\) and by applying the arguments of Case 1 in the interval \([{{\bar{s}}}, L^*]\), one has that \(p_{\psi } \equiv p_{\theta } \equiv 0\) in \([{\bar{s}},L^*]\) and that \(p_{\theta }(s)>0\) for all \(s\in [0,{\bar{s}})\). This completes the proof. \(\square \)
Proposition 7
Let \((x^*,u^*,L^*)\) be an optimal trajectory for problem (\(p_c\)). Let \(p=(p_\psi , p_\theta , p_\mu )\) be an adjoint arc which satisfies system (11) and assume that hypotheses (\(HP_{max}^1\)) and (\(HP_2c\)) hold true. Then
Proof
We will structure the proof of the proposition in three main steps.
Step 1: One has that \(p_{\mu }(0) \ge 0\). Assume by contradiction that \(p_{\mu }(0) < 0\). By continuity of \(p_{\mu }(\cdot )\), there exist constants \(\varepsilon ,\delta > 0\) such that
for a.e. \(s\in [0,\delta ]\). Hence, by using the first equation in (12), one has that
On the other hand, it follows from Proposition 6 that \(p_{\theta } \ge 0\) and from Proposition 4 that \(\psi ^*\le 0\) for all \(s\in [0,L^*)\). So, by using again the first equation of system (12), one obtains the relation
and we reach a contradiction. This shows that \(p_{\mu }(0) \ge 0\).
Step 2: Let us set
Then \(p_{\psi } \le 0\) for a.e. \(s \in A\).
Indeed, let us recall again that Proposition 6 asserts that \(p_{\theta } \ge 0\) for all \(s\in [0,L^*)\), while Proposition 4 asserts that \(\psi ^*\le 0\) for all \(s\in [0,L^*)\). It then follows from the first equation of system (12) that
that is
This shows the assertion of Step 2.
Step 3: \(p_{\mu }(s) \ge 0\) for all \(s\in [0,L^*]\).
Assume that there exists \(s_0 \in [0,L^*]\) such that \(p_{\mu }(s_0) < 0\) and define
Since \(p_{\mu }(0) \ge 0\), \(s_1 \in [0,s_0)\) and \(p_{\mu }(s_1) = 0\). In view of Step 2, one has that \(p_{\psi }(s) \le 0\) for a.e. \(s \in [s_1,s_0]\). Using the third equation in system (11), we obtain the inequality
and reach a contradiction. This completes the proof of Proposition 7. \(\square \)
The non-negativity of \(p_{\theta }\) and \(p_{\mu }\) is important for the determination of the optimal control. In the following, we use \({\mathcal {L}}\) to denote the Lebesgue measure in \({\mathbb {R}}\).
Theorem 5
Let \((x^*,u^*,L^*)\) be an optimal trajectory for problem (\(p_c\)). Assume that hypotheses (\(HP_2c\)) and (\(HP_{max}^1\)) hold true. Then, for a.e. \(s \in [0,L^*]\), one has
Proof
It follows from the control constraints of the optimal control problem that
As a first step, we prove that
that is, \(u_i^*\) cannot be an internal point of the admissible range of control values. Assume that this is not true for \(i = 1\). So, there exists a set \(D \subset [0,L^*]\) such that \({\mathcal {L}}(D) > 0\) and
for every \(s \in D\). Let p be an adjoint arc given by Theorem 4. From system (12) we have
a.e. in D. So, \(u_1^*(s)\) is a.e. a critical point of the Hamiltonian. Let \({\bar{s}}\in [0,L^*]\) be the time which appears in the statement of Proposition 6. It then follows from Propositions 4 and 6 that
If \({\mathcal {L}}(D \cap [0,{\bar{s}}]) > 0\), then conditions (21)–(22) imply that \(u_1^*(s)\) is a minimum for H in a set of positive measure, contradicting condition (13). Therefore, we can assume \(D \subset ({\bar{s}},L^*]\). In view of condition (21), we observe that \(p_{\mu } = 0\) in D. Hence this implies \(p_{\psi } \equiv p_{\theta } \equiv p_{\mu } \equiv 0\) in D. By using the first equation of system (12), one obtains the relation \(\lambda _0 = 0\). However, this implies that the non-triviality condition is violated a.e. in D, reaching a contradiction. Hence it follows that the relation (20) has to be satisfied.
We can now prove the thesis. Assume by contradiction that there exists a set E such that \({\mathcal {L}}(E) > 0\) and \(\lambda _1(s) > 0\), \(u_1(s) = u_M\) for a.e. \(s \in E\). It follows from the second equation of system (12) and from the inequality \(\xi _{u_1}(s)\ge 0\) a.e \(s\in [0,L^*]\) (see Remark 3) that
a.e. in E. On the other hand, it follows from Propositions 6 and 7 that the previous inequality holds true only if \(p_{\mu } \equiv p_{\theta } \equiv 0\) a.e. in E. Then, by using again Proposition 6, this implies that also \(p_{\psi } \equiv 0\) in \([\textrm{ess}\inf (E), L^*]\). Hence, by appealing again to the first equation of system (12), there exists a set of positive Lebesgue measure in which \(\lambda _0 = 0\). This contradicts the non-triviality condition appearing in the necessary conditions. The case of \(u_2\) is analogous. This completes the proof. \(\square \)
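Assuming that along the optimal trajectory the lower bound \(u \ge (|\psi |/c_2)^{2/3}\) implied by the mixed constraint \(|\psi | \le c_2 u^{3/2}\) of Fig. 1 is the active one, the optimal control of Theorem 5 can be read as the feedback law below (a sketch of the shape of (19), reconstructed from the constraint, with \(u_m\) acting as a floor):

```latex
u_1^*(s) = u_2^*(s)
  \;=\; \max \Bigl\{\, u_m ,\; \bigl( |\psi ^*(s)| / c_2 \bigr)^{2/3} \Bigr\}
  \qquad \text{for a.e. } s \in [0, L^*].
```

With this reading, \(u^*\) decreases together with \(|\psi ^*|\) and saturates at \(u_m\), in agreement with the behaviour displayed in Fig. 2.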
Remark 4
Problem (\(p_c\)) is a relaxation of the original Problem (P). Nevertheless, Theorem 5 states that the optimal control is not relaxed, in the sense that \(u_1^* = u_2^*\) a.e. in \([0,L^*]\). Consequently, Theorem 5 shows the equivalence between problems (\(p_c\)) and (P), since in both cases the optimal control (19) leads to the same optimal length \(L^*\).
Remark 5
All the arguments used so far are valid independently of the normality of the problem. By the first equation of system (12), if the adjoint arc p vanishes identically, then the multiplier \(\lambda _0\) vanishes as well. This excludes the possibility of a trivial p with a non-trivial \(\lambda _0\). However, the converse, and consequently the normality of the problem itself, is not immediate. A multiplier \(\lambda _0 \ne 0\) can keep track of any rescaling operation that involves the adjoint arc p. As we will see in the next section, this leads to a procedure for the numerical computation of the adjoint trajectories.
Remark 6
Problem \((P_c)\) admits only regular extremals. Indeed, by Proposition 4, any admissible trajectory for problem \((P_c)\) has \(\psi \ne 0\) for \(s \ne L\). Let \(D \subset [0,L^*]\) be a subset of positive measure in which some extremal is singular. By the second condition of system (12), any adjoint arc p that corresponds to the singular extremal must have \(p_{\theta }, p_{\mu } \equiv 0\) in D. By Proposition 6, this implies also \(p_{\psi } \equiv 0\) in D, hence also \(\lambda _0 = 0\) in D by the first equation of (12). These relations do not satisfy the non-triviality condition \((p,\lambda _0) \ne 0\) for all \(s \in [0,L^*]\), so Problem \((P_c)\) does not admit singular extremals.
Remark 7
In the proofs of Proposition 7 and Theorem 5, we have used the fact that along the optimal trajectory the Hamiltonian H is constantly equal to \(- \lambda _0\). This equality holds true because the dynamics used for problem (P) is autonomous, that is, f, \(h_{P,1}\), \(h_{P,2}\) and \({\mathcal {U}}\) do not depend explicitly on \(s \in [0,L]\). This "time" independence follows from our modelling assumptions. For instance, we considered the volume density \(\rho _3\) and Young's modulus E constant along the shoot. The inclusion of a time-dependent dynamics would require a further analysis of the sign of H.
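A sketch of the standard argument behind this equality (the cancellation below ignores, for simplicity, the multiplier term \(\xi \), so it is only a heuristic reconstruction):

```latex
% For an autonomous dynamics, along an extremal the map
% s -> H(x^*(s), p(s), u^*(s)) is (a.e.) constant, since heuristically
%   \frac{\mathrm{d}}{\mathrm{d}s} H\bigl(x^*(s), p(s), u^*(s)\bigr)
%     = \partial _s H = 0:
% the terms in \partial _x H and \partial _p H cancel by the state and
% adjoint equations, and the contribution of u^* vanishes by the
% maximum condition. The free end-time transversality condition then
% fixes the value of the constant:
H\bigl(x^*(s), p(s), u^*(s)\bigr) \equiv -\lambda _0
  \qquad \text{for a.e. } s \in [0, L^*].
```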
Numerical integration of system (23) (a–c) and optimal radius (d). As stated in Proposition 4, both \(\psi \) and \(\theta \) are increasing. The former is always negative, while the latter always lies in the interval \([\pi /2,\pi )\). As displayed in Fig. 2d, the radius decreases until, around 0.8 m from the base, it becomes constant. This is because, at that point, the optimal control \(u^*\) reaches the minimal value \(u_m\); since \(|\psi |\) keeps decreasing, from that point on \(u^*\) is constantly equal to \(u_m\)
6 Simulations
If conditions (\(HP_{max}^1\)) and (\(HP_2c\)) hold true, then Theorem 5 determines the optimal control in feedback form. As discussed in Remark 4, this result prescribes \(u_1^* = u_2^*\) a.e., so we can consider \(\lambda _1^* \equiv 1\) and problem (\(p_c\)) becomes equivalent to (P). Therefore, the optimal trajectory solves the boundary value problem
In our parameter pool (see Table 1), we do not have the value of the parameter \({\bar{\sigma }}\). On the other hand, we assume that the value of the maximal length \(L^*\) is known. So, to integrate system (23), we have to determine those values of \(\psi (0)\) and \(c_2\) such that \(\psi (L^*) = \mu (L^*) = 0\). To achieve these endpoint conditions, we wrote a Matlab script which employs the function bvp5c to solve the boundary value problem. The parameters displayed in Table 1 lead to a constant \(c_1\) which satisfies condition (\(HP_2c\)). Indeed
Since we do not need to set any value of \(u_M\), conditions (\({HP_{max}}\)) and (\(HP_{max}^1\)) are immediately satisfied.
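The shooting task described above, tuning \(\psi (0)\) and \(c_2\) until \(\psi (L^*) = \mu (L^*) = 0\), is a boundary value problem with unknown parameters. A minimal sketch of the same technique, using SciPy's `solve_bvp` in place of Matlab's `bvp5c` and a toy eigenvalue problem standing in for system (23):

```python
import numpy as np
from scipy.integrate import solve_bvp

# Toy analogue of the shooting task: find the unknown parameter k
# (playing the role of the unknown constants psi(0), c_2) so that the
# two-point boundary conditions are met.
# System: y'' = -k^2 y on [0, 1], with y(0) = 0, y(1) = 0 and the
# normalisation y'(0) = 1; the relevant solution has k = pi.

def rhs(s, y, p):
    k = p[0]
    return np.vstack([y[1], -k**2 * y[0]])

def bc(ya, yb, p):
    # three residuals: two states plus one unknown parameter
    return np.array([ya[0], yb[0], ya[1] - 1.0])

s = np.linspace(0.0, 1.0, 50)
y0 = np.zeros((2, s.size))
y0[0] = np.sin(np.pi * s)   # initial guess for the profile
y0[1] = np.cos(np.pi * s)
sol = solve_bvp(rhs, bc, s, y0, p=[2.0])

print(sol.p[0])  # close to pi
```

The unknown-parameter mechanism (`p` in `solve_bvp`, extra entries in the boundary-condition residual) mirrors how bvp5c handles the free constants \(\psi (0)\) and \(c_2\) in the actual computation.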
The adjoint arc (a–c) and the Hamiltonian (d). Each graph displays four iterations of the process expressed by Eq. (26). The iterations show that each component of the adjoint arc and the Hamiltonian converge at least pointwise. We notice that \(p_{\theta }\) and \(p_{\mu }\) are always non-negative. So, the adjoint arc agrees with the results of Propositions 6 and 7, and the iterations of the Hamiltonian converge to a function constantly equal to \(-1\)
6.1 Simulation of the Adjoint System
The presence of the normal cone in the dynamics of the adjoint arc makes the simulation of the adjoint system a non-trivial procedure. However, the second equation of system (12) and Remark 3 suggest the following heuristic iterative process to generate the component \(\xi _{\psi }\) of the normal vector.
Define \(s_0 \in [0,L^*]\) as the value at which
Assume that Eq. (11) has been rescaled with a piecewise constant function \(\gamma _0\) in \([0,L^*]\), so that
The value of \(s_0\) can be estimated by integrating system (23); for \(s > s_0\) the trajectory x does not activate the constraint on \(|\psi |\), that is, \(|\psi | < u_m^{3/2} c_2\). Consequently, \(\xi _{\psi ,0} \equiv \xi _{\psi } \equiv 0\) in \([s_0,L^*]\). Denote by \(p_{\psi ,0}, p_{\theta ,0}\) the solutions to the boundary value problem
Here, \(p_{\psi ,0}\) and \(p_{\theta ,0}\) are the first two components of the adjoint arc p solving the adjoint system (11) rescaled by \(\gamma _0\). That is, if p is a solution of the adjoint system (11), then \((p_{\psi }/\gamma _0, p_{\theta }/\gamma _0)\) is a solution of (24) and vice-versa. The component \(p_{\mu }\) is not considered by system (24) because it does not affect the behaviour of the other components of the adjoint arc and because we do not have any information on its boundary values. However, we can retrieve the values of \(p_{\mu ,0} = p_{\mu }/\gamma _0\) in \([0,s_0]\) by considering the second equation of system (12). So, we have
with \(\xi _{u,0} = \xi _{u}/\gamma _0\).
In the interval \([0,s_0]\) we know that without the rescaling by \(\gamma _0\) we have (see Remark 3)
since \(\xi _{\psi }/\gamma _0 = \xi _{\psi ,0}\), we deduce that in \([0,s_0]\)
This allows us to estimate \(p_{\mu ,0}\) in \([0,s_0]\).
We now make a further assumption: the problem is normal, that is, in the first equation of system (12) we have \(\lambda _0 = 1\). Hence, we can estimate \(\gamma _0\) in \([0,s_0]\), since we have
Of course, this reasoning works only if we assume that the rescaling function \(\gamma _0\) is at most a piecewise constant function. However, we can reiterate this procedure considering
and taking \(p_{\psi ,i}\), \(p_{\theta ,i}\) as the solutions to system (24) with \(\xi _{\psi ,i}\) in place of \(\xi _{\psi ,0}\), for \(i = 1,2,\dots \)
7 Results
The numerical solution of system (23) is displayed in the graphs of Fig. 2. The optimal trajectory respects the conditions of Propositions 6 and 7, as we expect since condition (\(HP_2c\)) is satisfied. In particular, \(\psi \) is negative and increasing, which means that \(|\psi |\) is decreasing. By Theorem 5, the optimal control is directly proportional to \(|\psi |\); consequently, the optimal radius \(\sqrt{u^*}\) decreases until the control reaches the minimal value \(u_m\).
Regarding the simulation of the adjoint system, as displayed in Fig. 3, the sequence of adjoint arcs \((p_{\psi ,i}, p_{\theta ,i},p_{\mu ,i})\) converges to some function \((p_{\psi },p_{\theta },p_{\mu })\), and the Hamiltonian converges to the constant value \(-1\). Again, the simulated trajectories respect the conditions of Propositions 6 and 7. So, we observe a positive \(p_{\theta }\) in \([0,L^*)\) and a non-negative \(p_{\mu }\).
8 Discussion
The control u determines the rate of mass decrease \(\mu '\) and without the constraint
it can assume any value in the interval \([u_m,u_M]\). Since we require \(\mu (0) = M\) and \(\mu (L) = 0\), the lower u is, the slower \(\mu \) decreases and the larger L is. In this situation, the optimal strategy would be to take the lowest allowed value of u, that is, \(u_m\). This reasoning motivates the intuition behind relation (19) for the optimal control, since in (P) the constraint (27) provides a lower bound for u.
From the relations
where we recall that R is the radius of the cross-section, we can formulate equation (19) as follows:
Equation (28) gives a relation between the curvature \(|\theta '|\) and the radius R of the cross-section. In other words, if we assume that searcher shoots grow so as to optimise their length, then their cross-section is regulated by a feedback control system. Indeed, at each point of the shoot, the cross-section, together with other physical parameters and forces such as gravity, influences the curvature of the shoot. Then, as expressed by Equation (28), the resulting curvature feeds back on the cross-section: the more the shoot deviates from a straight line, the thinner the cross-section. This closes the feedback loop, which is repeated at the successive points.
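As a minimal numerical illustration of this feedback law, assuming, as stated in the text, \(c_2 = R\,|\theta '|\), and clipping the result to the admissible radius range implied by \(u \in [u_m,u_M]\) (the function name and the sample values below are hypothetical):

```python
def optimal_radius(curvature, c2, r_min, r_max):
    """Radius prescribed by the feedback law R = c2 / |theta'| of
    Eq. (28), clipped to the admissible range [r_min, r_max]
    (standing in for [sqrt(u_m), sqrt(u_M)]); values are illustrative."""
    if curvature == 0.0:
        return r_max  # a locally straight segment gets the thickest section
    return min(max(c2 / abs(curvature), r_min), r_max)

# The larger the curvature, the thinner the cross-section:
print(optimal_radius(0.5, 1.0, 0.1, 10.0))    # 2.0
print(optimal_radius(4.0, 1.0, 0.1, 10.0))    # 0.25
print(optimal_radius(100.0, 1.0, 0.1, 10.0))  # clipped at r_min = 0.1
```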
The feedback control mechanism is common in biological systems and in particular in plants (Meroz 2021). Consider for instance the gravitropic mechanism. In general, plants are able to perceive their local inclination and their local curvature through some specialized cells (Chauvet et al. 2019; Moulia et al. 2021). This information is compared with a target inclination and a target curvature (Meroz and Silk 2020), inducing a flux of hormones to regulate growth and posture (Meroz et al. 2019; Moulton et al. 2020). The plant then attains a new shape, which has different local inclinations and curvatures, and the cycle is repeated.
Further improvements in the modelling can be achieved by considering, for instance, (i) a variable volume density and Young's modulus, or (ii) the mass of the leaves. In addition, \(c_2\) is a dimensionless constant: from Eq. (28), we observe that it equals the product of the radius R of the cross-section and the curvature \(|\theta '|\). A comparison with experimental data on radius and curvature would improve the accuracy of the model.
In addition to a deeper insight into plants' ecological behaviour, our study can contribute to other research fields. In robotics, for instance, Euler–Bernoulli beam models are used to design soft robots such as gripping hands (Zhou et al. 2015) or arms (Olson et al. 2020; Sipos and Várkonyi 2020). In the latter case in particular, the problem of the longest self-supporting structure is studied from a stability point of view. A comparison between those results and our study constitutes a stimulating research direction.
9 Conclusion
Control theory has an extremely wide range of applications, from the design of mechanical devices to physics, economics and biology. Starting from a physical model of a searcher shoot based on the Euler–Bernoulli theory of elastic rods, we used optimal control theory to study the behaviour of the radius that maximises the length. To achieve this goal, we formulated system (P), which turned out to be a boundary value problem with nonlinear dynamics and constraints on the state variable. Our approach consisted in using relaxation to convexify the dynamics and in applying the Pontryagin Maximum Principle for mixed state-constrained systems. The resulting optimal control, expressed in Theorem 5, proves the equivalence between the original problem (P) and the relaxed one (\(p_c\)). Moreover, it gives a relation between the radius and the inverse of the curvature through the dimensionless constant \(c_2\).
Notes
Here, we use \(|| \cdot ||_{\infty }\) to denote the standard supremum norm in C([0, L]).
References
Bastien, R., et al.: Unifying model of shoot gravitropism reveals proprioception as a central feature of posture control in plants. Proc. Natl. Acad. Sci. 110(2), 755–760 (2013)
Chauvet, H., et al.: Revealing the hierarchy of processes and time-scales that control the tropic response of shoots to gravi-stimulations. J. Exp. Bot. 70(6), 1955–1967 (2019)
Clarke, F.: Necessary Conditions in Dynamic Optimization. American Mathematical Soc, London (2005)
Clarke, F.: Functional Analysis, Calculus of Variations and Optimal Control, vol. 264. Springer, Berlin (2013)
Clarke, F., De Pinho, M.R.: Optimal control problems with mixed constraints. SIAM J. Control Optim. 48(7), 4500–4524 (2010)
Farjoun, Y., Neu, J.: The tallest column: a dynamical system approach using a symmetry solution. Stud. Appl. Math. 115(3), 319–337 (2005)
Fiorello, I., et al.: Taking inspiration from climbing plants: methodologies and benchmarks—a review. Bioinspir. Biomimet. 15(3), 031001 (2020)
Goodno, B.J., Gere, J.M.: Mechanics of Materials. Cengage learning, Atlanta (2020)
Goriely, A.: The Mathematics and Mechanics of Biological Growth, vol. 45. Springer, Berlin (2017)
Hattermann, T., et al.: Mind the gap: reach and mechanical diversity of searcher shoots in climbing plants. Front. For. Glob. Change 5, 836247 (2022)
Keller, J.B.: The shape of the strongest column. Arch. Ration. Mech. Anal. 5, 275–285 (1960)
Mazzolai, B.: Plant-inspired growing robots. In: Soft Robotics: Trends, Applications and Challenges. Springer, pp. 57–63 (2017)
Mazzolai, B., et al.: Emerging technologies inspired by plants. In: Bioinspired Approaches for Human-Centric Technologies. Springer, pp. 111–132 (2014)
Meder, F., Babu, S.P.M., Mazzolai, B.: A plant tendril-like soft robot that grasps and anchors by exploiting its material arrangement. IEEE Robot. Autom. Lett. 7(2), 5191–5197 (2022)
Meroz, Y.: Plant tropisms as a window on plant computational processes. New Phytol. 229(4), 1911–1916 (2021)
Meroz, Y., Silk, W.K.: By hook or by crook: how and why do compound leaves stay curved during development? J. Exp. Bot. 71(20), 6189–6192 (2020)
Meroz, Y., Bastien, R., Mahadevan, L.: Spatio-temporal integration in plant tropisms. J. R. Soc. Interface 16(154), 20190038 (2019)
Moulia, B., et al.: Posture control in land plants: growth, position sensing, proprioception, balance, and elasticity. J. Exp. Bot. 70(14), 3467–3494 (2019)
Moulia, B., Douady, S., Hamant, O.: Fluctuations shape plants through proprioception. Science 372(6540), eabc6868 (2021)
Moulton, D.E., Oliveri, H., Goriely, A.: Multiscale integration of environmental stimuli in plant tropism produces complex behaviors. Proc. Natl. Acad. Sci. 117(51), 32226–32237 (2020)
Olson, G., et al.: An Euler–Bernoulli beam model for soft robot arms bent through self-stress and external loads. Int. J. Solids Struct. 207, 113–131 (2020)
Rowe, N.P., Speck, T.: Biomechanical characteristics of the ontogeny and growth habit of the tropical liana Condylocarpon guianense (Apocynaceae). Int. J. Plant Sci. 157(4), 406–417 (1996)
Rowe, N.P., Speck, T.: Stem biomechanics, strength of attachment, and developmental plasticity of vines and lianas. Ecol. Lianas 323–341 (2015)
Rowe, N., Isnard, S., Speck, T.: Diversity of mechanical architectures in climbing plants: an evolutionary perspective. J. Plant Growth Regul. 23(2), 108–128 (2004)
Sipos, A.A., Várkonyi, P.L.: The longest soft robotic arm. Int. J. Non-Linear Mech. 119, 103354 (2020)
Vecchiato, G., et al.: A 2D model to study how secondary growth affects the self-supporting behaviour of climbing plants. PLoS Comput. Biol. 19(10), e1011538 (2023)
Vinter, R.B.: Optimal Control. Springer, Berlin (2010)
Wei, Z., Mandre, S., Mahadevan, L.: The branch with the furthest reach. Europhys. Lett. 97(1), 14005 (2012)
Zhou, X., Majidi, C., O’Reilly, O.M.: Soft hands: an analysis of some gripping mechanisms in soft robot design. Int. J. Solids Struct. 64, 155–165 (2015)
Funding
Open access funding provided by Gran Sasso Science Institute - GSSI within the CRUI-CARE Agreement.
Additional information
Communicated by Timothy Healey.
A Admissible Trajectories
In this section, we prove that any admissible trajectory of problem (\(p_d\)) is an admissible trajectory of problem (\(p_c\)). To achieve this aim, we need the following statements, whose proofs can be found in Vinter (2010).
Proposition 8
Take a Borel measurable multifunction \(V: \mathbb {R}^m \rightrightarrows \mathbb {R}^n\) and a Lebesgue measurable function \(x: [0,L] \rightarrow \mathbb {R}^m\). Then the multifunction \(V \circ x: [0,L] \rightrightarrows \mathbb {R}^n\) is Lebesgue measurable.
Theorem 6
Consider a closed nonempty multifunction \(G: [0,L] \rightrightarrows \mathbb {R}^n\). Then G is Lebesgue measurable if and only if its graph Gr(G) is a measurable subset of \([0,L] \times \mathbb {R}^n\).
We can now prove the following relation between (\(p_c\)) and (\(p_d\)).
Proposition 9
(x, L) is an admissible trajectory for problem (\(p_d\)) if and only if there exists a measurable (u, L) in \(\mathcal {V}\) such that (x, u, L) is admissible for problem (\(p_c\)). Moreover, \((x^*,L^*)\) is an optimal trajectory for problem (\(p_d\)) if and only if there exists a measurable \(u^*\) such that \((u^*,L^*) \in \mathcal {V}\) and \((x^*,u^*,L^*)\) is an optimal trajectory for problem (\(p_c\)).
Proof
We just need to prove that for any \(x \in W^{1,1}([0,L];\mathbb {R}^3)\) such that (x, L) is a trajectory of (\(p_d\)), there exist some measurable functions \(u_1,u_2,\lambda \) such that \((x,(u_1,u_2,\lambda ),L)\) is an admissible trajectory for (\(p_c\)). Consider the multifunction \(V: \mathbb {R}^3 \rightrightarrows \mathbb {R}^3\)
that essentially is the one defined in Remark 1. Proceeding as in the proof of Theorem 1, we observe that V is a Borel measurable multifunction. By Proposition 8, the multifunction
is a Lebesgue measurable multifunction. By definition of V, G is always a closed multifunction. We can then apply Theorem 6 to observe that the graph Gr(G) is a measurable subset of \([0,L] \times \mathbb {R}^3\). We can then apply the Generalized Filippov Selection Theorem (see for instance Theorem 2.3.13 of Vinter 2010) to obtain that the multifunction \(U': [0,L] \rightrightarrows \mathbb {R}^3\) defined by
has a measurable graph. Finally, by Aumann's measurable selection theorem (see Theorem 2.3.12 in Vinter (2010)), we find a measurable selection of \(U'\), and this concludes the proof. \(\square \)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Vecchiato, G., Palladino, M. & Marcati, P. An Optimal Control Approach to the Problem of the Longest Self-Supporting Structure. J Nonlinear Sci 34, 36 (2024). https://doi.org/10.1007/s00332-023-10011-5