Discrete Adjoint Method for Variational Integration of Constrained ODEs and its application to Optimal Control of Geometrically Exact Beam Dynamics

Direct methods for the simulation of optimal control problems apply a specific discretization to the dynamics of the problem, and the discrete adjoint method is suitable to calculate corresponding conditions to approximate an optimal solution. While the benefits of structure preserving or geometric methods have been known for decades, their exploration in the context of optimal control problems is a relatively recent field of research. In this work, the discrete adjoint method is derived for variational integrators yielding structure preserving approximations of the dynamics firstly in the ODE case and secondly for the case in which the dynamics is subject to holonomic constraints. The convergence rates are illustrated by numerical examples. Thirdly, the discrete adjoint method is applied to geometrically exact beam dynamics, represented by a holonomically constrained PDE.


Introduction
There are two alternative ways to handle an optimal control problem numerically.The so-called indirect methods first derive the necessary conditions for optimality in the continuous-time setting by applying Pontryagin's maximum principle and then discretizing the resulting equations.In contrast, direct methods first discretize the continuous problem, turning it into a finite dimensional one, and then apply a discrete version of Pontryagin's maximum principle.In both cases, one is led to the augmentation of the original objective with the different constraints enforced by Lagrange multipliers.The Lagrange multipliers enforcing the plant (the dynamic equations) of the problem are commonly called adjoint or co-state variables.In the multibody systems literature it is common to refer to this as the adjoint method, and in particular, the discrete adjoint method when considered as a direct method.In this contribution, we apply the discrete adjoint method to optimal control problems with variational integrators approximating the dynamics.In general for direct approaches, the discretization of the ODE governing the dynamics results in a specific discretization of the adjoint variables especially for symplectic methods as e.g.variational integrators [1,2,3].Variational and thus symplectic numerical methods are worthy of consideration as they can benefit the solution of boundary value problems [4].For the optimal control of constrained ODEs, discretizations with conservation properties are of interest as well [5,6].The optimal control of mechanical PDEs, such as string and beam dynamics is an active field of research [7,8,9].The discrete adjoint method has been used for the optimization of flexible multibody systems [10] as well as for parameter identification in rigid body dynamics [11,12].The discrete adjoint method for variational integrators with holonomic constraints is discussed in [13].The discrete adjoint method is derived for a specific discretization of dynamics and this matches the chosen integrator.Therefore, it suggests itself to be applied to integrators that are structure preserving [14].In this work we briefly summarize variational integrators and then show how to derive the discrete adjoint equations for this class of integrators.The basic principles, the derivation of boundary conditions and the discretization of forces are explained.The discrete adjoint method is then extended to variational integrators for holonomically constrained ODEs.The convergence behavior of both methods is investigated with the example of a mathematical pendulum.Finally, the method is applied to the constrained PDE case of geometrically exact beam dynamics.

Discrete Adjoint Method for Variational Integrators 2.1 Variational Integrators
This section illustrates the derivation of the equations of motion for forced systems via variational principles in the continuous and discrete setting [15,2].These equations have to be fulfilled as constraints for the optimal control problem.Consider a Lagrangian mechanical system whose configuration space is the n-dimensional smooth manifold Q.The motion of our system is represented by a curve q : [0, T ] → Q, t → q(t).We denote the velocity of the configuration at time t by q(t) ∈ T q(t) Q.The Lagrangian is a function defined on the tangent bundle of Q, T Q, L : T Q → R. It usually represents the difference of kinetic and potential energy.An external Lagrangian control force is a map f L : T Q × U → T * Q where U ⊆ R l , l ≤ n, is the space of admissible controls.A control is thus a curve u : [0, T ] → U.The total virtual work of such a system vanishes δ T 0 L q(t), q(t) dt + T 0 f L q(t), q(t), u(t) δq(t) dt = 0, ∀δq(t) This is the Lagrange-d'Alembert principle (with controls), which states that the total virtual work evaluated over a physical trajectory of the system q (and a control u) vanishes for all variations δq(t) with fixed end-points δq(0) = δq(T ) = 0.This leads to the equations of motion, the forced Euler-Lagrange equations: This principle is an extension of Hamilton's principle to include non-conservative forces such as control or dissipative forces.A forced variational integrator is derived via the approximation of the action and the virtual work of non-conservative forces and subsequent variation in the discrete setting [15,16,17].The time interval [0, T ] is discretized by N time nodes, we consider a discrete configuration path {q n } N n=0 with q n ≈ q(t n ) with linear approximation of q(t) in [t n , t n+1 ].The approximation of the action integral via the discrete Lagrangian L d and the approximation of the virtual work of non-conservative forces via the left and right side discrete forces f − d and f + d is considered.The input variable is approximated as u n ≈ u(t n ).In each time interval [t n , t n+1 ], the control path The discrete total virtual work vanishes: with δq 0 = δq N = 0.The discrete Lagrange-d'Alembert principle leads to the discrete, forced Euler-Lagrange equations, which are derived via discrete variation and subsequent rearrangement of terms for fixed boundary conditions.The slot derivatives D k denote derivatives with respect to the k-th argument.
for n = 1, ..., N − 1.This equation takes two positions at the current and the previous time node and defines the relation with the next one.Given q n−1 , q n , u n−1 and u n , this equation determines a unique q n+1 provided the discrete Lagrangian is regular, i.e. the matrix The initial conditions are usually defined on T Q as position and velocity or on T * Q as position and momentum, but not on Q × Q as two positions at different points in time.To initialize this time stepping scheme, both a continuous and discrete version of the Legendre transformation are needed.
The continuous Legendre transformation, FL : T Q → T * Q, (q, q) → (q, p = D 2 L(q, q)) connects the Lagrangian and the Hamiltonian formulations of dynamics.It allows us to compute an initial momentum p 0 from an initial configuration and velocity, (q 0 , q0 ).In the discrete setting, the (forced) discrete Legendre transformation defines two distinct maps from the discrete state space to the cotangent bundle, with the left and right side discrete momenta p − n and p + n .These allow us to interpret the discrete Euler-Lagrange equations (6) as a matching of momenta p − n = p + n for n = 1, ..., N − 1.
In order to initialize the algorithm, given a configuration q 0 , a velocity q0 and an initial control u 0 , the relation

Derivation of the Discrete Adjoint Method for Variational Integrators
Similar to the discrete variational principle in Section 2.1, now the discrete adjoint method for variational integrators in ( 6) is derived via a discrete variational principle and the structure and the resulting numerical method for the adjoint equations are illustrated.
Here, we concentrate on a discrete objective J d containing a quadratic Mayer term, where S q and S p are positive semidefinite matrices.The Mayer term is used to relax the enforcement of the end state conditions, (q N , p N ), introducing weights for the reaching of the configuration and the momentum at the last time step N .The discrete adjoint method is derived by augmenting the objective with the variational integrator ( 6) and ( 8) as constraints and by taking variations of the augmented objective [2].
The resulting nonlinear constrained optimization problem reads min subject to: The quantities p 0 and q 0 are prescribed initial conditions at the initial time node.The objective also includes a Lagrange term which is quadratic in the control and R is a positive-definite weight matrix.Equation (10e) defining p N corresponds to the discrete Legendre transformation F + L d (q N −1 , q N , u N −1 ).
REMARK 1: The dependence on q N −1 and q N of the momentum term (10e) of the Mayer term makes it more prone to produce larger contributions than the configuration term.This can make the optimization process unstable and possibly not convergent.
In order to improve this, an iterative approach may be used where the end momentum of the (i)-th iteration, p N , is used to inform the choice of a modified desired end momentum, pN , such that ∥p The procedure can be initialized by considering a first iteration with S p = 0, and ended once ∥p (i) N − p N ∥ is sufficiently small to allow us to substitute pN by p N in a final iteration.
The objective J d is augmented to Jd by the initial conditions and the discrete Euler-Lagrange equations via adjoint variables λ n ≈ λ(t n ) with the discrete adjoint path λ d = {λ n } N −1 n=0 .The indices are chosen such that λ n pairs with the corresponding momenta p ± n .
The discrete variation of the augmented objective δ Jd = 0 has to vanish for variations δu n , δλ n and δq n with boundary conditions δq 0 = 0 that is directly enforced as q 0 = q 0 at the initial time node is specified in problem (10).The variation of the three types of variables leads to three sets of equations.The variation w.r.t the adjoint variables leads to the discrete Euler-Lagrange equations, the constraints in (10).The variation with respect to the configuration variable yields the adjoint equations, reading with rearrangement of terms as follows: The discrete variational principle directly provides the boundary conditions (12a) and (12b) for the two last adjoint variables, as no boundary conditions for the state variables are prescribed at these time nodes.The variation w.r.t. the input u n yields the optimality conditions.Note that the last equation is different: The discrete Euler-Lagrange equations ( 6) can be solved forward in time and the adjoint equations ( 12) backward in time sequentially given the configuration path to determine the discrete adjoint variables as a shooting method while using the input equations ( 13) to update the input.Such a direct shooting algorithm directly uses the equations derived above and thus is simple to implement.However, an appropriately small time step h is necessary for stable integration in both directions in time.The discrete optimization problem with respect to q d , u d and λ d can also be solved by applying an interior point algorithm [18] or sequential quadratic programming [19].In those, the variational integrator is used as equality constraints for the optimization as in (10)

Application of the Discrete Adjoint Method to a Mathematical Pendulum
Let us consider a mathematical pendulum as depicted in Figure 1, in minimal coordinates q = φ with the Lagrangian L(φ, φ) = 1 2 ml 2 φ2 − mgl cos(φ) that is actuated by a torque f = u.The discrete Lagrangian approximated with the midpoint rule is ) with the time step h.The discrete forces are Figure 1: Torque-controlled mathematical pendulum.
We wish to perform an optimal upswing maneuver.Thus, the initial configuration and momentum are ϕ 0 = 0 and p 0 = 0, and the desired end configuration is q N = φ N = π.The end momentum has to vanish p N = 0.The first slot derivatives of the discrete Lagrangian used for the discrete Euler-Lagrange equations are: The time stepping scheme (6) for the configuration is: It is initialized with The second derivatives of the discrete Lagrangian inserted in the adjoint equations (12c) leads to: Two equations according to (12a) and (12b) are necessary to initialize the backward integration (18) in time: a Error of configuration q d versus time step for the mathematical pendulum in minimal coordinates.The equations for the input are: The convergence of the configuration q d and the adjoint variables λ d is illustrated in Figures 2a and 2b, respectively.A simulation time of T = 2 and a constant input of u n = 1, for n = 0, ..., N − 1 is used; the pendulum has a length of L = 1 with a gravitational constant of g = 9.81.The mass of the pendulum is m = 1.For the input weight R = 10 −5 h is used.The weights in the Mayer term are S q = 10 3 and S p = 10 −2 .These values were chosen to obtain solutions that achieve the upswing of the pendulum to the upper equilibrium point, with minimal effort.Larger values for the input weighting lead to solutions with end configuration at the lower equilibrium of the pendulum.The absolute error in these plots is computed using the infinity norm of the difference of the variables and a reference solution, (q ref , λ ref ), which is a simulation with a fine discretization of h = 10 −5 , ∥q d − q ref ∥ ∞ and ∥λ d − λ ref ∥ ∞ , respectively.The convergence rate for the configuration and adjoint variables is equal, we observe second order convergence.This is in accordance to the theoretical results in [2].These convergence results are derived for the forward integration of the time stepping scheme ( 16) and the subsequent backwards solution of (18) using the configuration variables calculated with the same time step width.
The optimized motion of the pendulum is depicted in Figures 3a, 3b, 3c and 3d.The momentum p and the kinetic energy T is close to zero at the end of the simulation with the optimized input acting on the pendulum.REMARK 2: Pontryagin's maximum principle leads to necessary conditions for optimality in the continuous-time setting.The resulting adjoint equations are λT −g/lλ cos φ = 0 and the control equations are Ru + λ = 0.It can be checked that the discrete equations ( 18) and (20a) are the corresponding discrete versions of these equations when discretized using a midpoint rule.The discrete boundary conditions ( 19) and (20b), however, are not so easy to relate to their continuous counterparts.We plan to address this very issue in a future publication.
3 Discrete Adjoint Method for Variational Integration of Constrained Dynamics

Variational Integration of Constrained Dynamics
The derivation of variational integrators for constrained systems that use null space projection and nodal reparametrization [20] is shortly summarized in the following section, using similar steps as in Section 2. The discrete adjoint method for such systems is derived thereafter similar to Section 2.2.
Up until now, we have workd in local coordinates directly on the configuration manifold Q.However, it can be advantageous to consider Q an ambient (vector) space parametrized by redundant coordinates, and constrain the motion by constraints.Given a scleronomic, holonomic constraint function g : We assume that the Jacobian ∂g ∂q has full rank m, so the dimension of the constraint manifold is n − m, the number of degrees of freedom of the mechanical system.We also assume consistent initial conditions (q 0 , q0 ) that fulfill the constraints on configuration level g(q 0 ) = 0 as well as on velocity level d dt g(q 0 ) = ∂g(q 0 ) ∂q q0 = 0.A Lagrange multiplier ν is used to enforce the constraint by appending the term −g(q) T ν to the Lagrangian in the action integral.Thus, the Lagrange-d'Alembert principle in this setting reads 0 = δ T 0 L q(t), q(t) − g(q) T ν dt + T 0 f L q(t), q(t), u(t) δq dt, ∀δq, δν with δq(0) = δq(T ) = 0.The constraint part of the action integral is approximated with the trapezoidal rule: with g d (q n ) = hg(q n ).Including this in the discrete variational principle in (5), in the constrained case, the discrete variational principle the variation of the discrete action sum with the variations δq n and δν n and δq 0 = δq N = 0 with subsequent rearrangement of terms leads to the discrete, constrained Euler-Lagrange equations of dimension n + m.To reduce the dimension of (24a) from n to n − m and eliminate the Lagrange multipliers, thus avoiding conditioning problems related to these, a discrete null space matrix P (q n ) ∈ R n×(n−m) , with columns spanning the tangent space T qn M, can be applied that only depends on quantities at the current step such that the constraint forces are eliminated.Further, a nodal reparametrization q n+1 = F d (q n , v n+1 ) with v n+1 ∈ V ⊆ R n−m is then used to eliminate the constraints as g(F d (q n , v n+1 )) = 0, ∀v n+1 ∈ V, for n = 0, ..., N −1.Together with the null space matrix, the reparametrization F d : V × Q → M leads to the integration scheme that has to be iteratively solved for v n+1 in each time step, given q n−1 , q n , u n−1 and u n .
The redundant control forces f (q, u) = B T (q)τ (u) ∈ R n depend on the generalized control forces τ (u) ∈ R n−m and the input transformation matrix B T (q) ∈ R n×(n−m) that must be chosen such that the consistency with the constraints and consistency of momentum maps is ensured [6].The discrete approximations of the redundant forces capture the effect of the generalized forces acting on the time [t n , t n+1 ].We have assumed that u is approximated constant in each time interval.

Derivation of the Discrete Adjoint Method for Variational Integration of Constrained Dynamics
The constrained setting with null space projection and reparametrization for a mechanical system leads to implicit equations of minimal dimension.The discrete adjoint method applied to such a system leads to adjoint variables of minimal dimension n − m.It also involves the null space projection for the adjoint equations.The starting point is a problem such as in equation (10), but now constrained by the discrete Euler-Lagrange equations for the constrained system with null space projection and nodal reparametrization (25) as in [6].Similar to the procedure outlined in Section 2.2, the objective is augmented with the discrete Euler-Lagrange equations.As these equations are defined on M using the nodal reparametrization, q n+1 = F d (q n , v n+1 ), the adjoint variables are of the same dimension as v n+1 .An objective J d consisting of a Mayer term and an integral term quadratic in the control, similar to the discrete adjoint method for systems without constraints in Equation ( 10) is considered: However, to simplify matters, the Mayer term of the momentum has been omitted since it can be handled similarly as in the unconstrained case.The variation of the objective, δJ d , with respect to all variables δλ n , δu n and δv n+1 at all time steps has to vanish.The variation of the redundant configuration δq n with respect to the minimal coordinate δv n reads The Jacobian matrix ∂F d ∂vn is a null space matrix [21].After applying this relation, the adjoint equations become The variations with respect to the input variables vanish, if holds.The evaluation of these equations can be used to update the input variables in a shooting method.
b Error of configuration q d versus time step for the mathematical pendulum as constrained system.
Figure 4: Error of the configuration q d and adjoint variable λ d .

Discrete Adjoint Method for a Mathematical Pendulum described as Constrained System
The mathematical pendulum is described as a constrained system in the ambient space Q = R 2 with redundant coordinates q = [x y] T and the constraint equation g(q) = 1/2(x 2 + y 2 − l 2 ).The null space matrix is , the generalized force is τ (u) = u, and the nodal reparametrization reads The input variable can be interpreted as the physical torque and the variable v as the incremental angle.The Figures 4a and 4b show the convergence results for the pendulum in the constrained case.The adjoint variables are of minimum dimension (n − m) just as the configuration variables.The error is calculated in the same way as for the unconstrained case in Section 2.3 as infinity norm of the difference to the reference trajectory using the same parameters.These errors are determined with solutions obtain via forward timestepping for the configuration and backward timestepping for the adjoint variables with fixed input.It can be observed in the figures that also in the constrained case the convergence rate is quadratic.However, note that the theoretical results in [2] only consider the case in minimal coordinates and not the constrained case.
The optimized motion of the pendulum is depicted in Figures 5a, 5b, 5c and 5d.The input u and the kinetic energy T are close to zero at the end of the simulation with the optimized input acting on the pendulum.The end configuration is weighted with S q = 10 3 , the end momentum weight is S p = 10 −2 .The weight for the input is R = 10 −5 h.This low weight for the input is chosen to reach the upper equilibrium position of the pendulum.It reduces the input from a constant initial guess of 1 as well as regularizing the  The results are similar to those obtained previously by the pendulum in minimal coordinates.Small differences in the solution are visible, but show a similar optimized result.

Discrete Method for Geometrically Exact Beam Dynamics
In this section, the discrete adjoint method is applied to an optimal control problem involving dynamics of a geometrically exact beam being approximated via the multisymplectic integrator found in [22].

Geometrically Exact Beam Model
The geometrically exact beam [26] models a rod-like deformable body as a curve x(t, s) ∈ R 3 , with a rigid cross section attached to each of its points.Here t ∈ [0, T ] is used again to parametrize time, while s ∈ [0, ℓ] parametrizes the longitudinal position along the curve.The orientation of the cross section at s is described by a rotation R(t, s) ∈ SO(3).
When considered as a collection of columns, R(t, s) = [d 1 (t, s), d 2 (t, s), d 3 (t, s)], the triad of vectors are known as the directors of the cross section.
This can be considered as a Lagrangian field theory with configuration space Q = R 3 × SO(3).This space is diffeomorphic to the group of special Euclidean transformations in 3D, SE(3), to which it differs only in terms of group structure.In [22,27], the authors claim it to be numerically more advantageous to consider this latter space.
If g(t, s) = (R(t, s), x(t, s)) ∈ SE(3) denotes the configuration of a cross section, its derivatives with respect to t and s are related to velocities and strains respectively.More specifically, (Ω, V ) = g −1 ġ = (R −1 Ṙ, R −1 u) body angular and linear time derivatives

body angular linear derivatives
where we have used Ẋ = ∂X ∂t and X ′ = ∂X ∂s and "body" is meant to signify "in the reference frame of the section itself".Considering a reference configuration g ref (s) ∈ SE(3), we also define the strains The simple case of a straight initial configuration along the e 1 axis, g ref (s) = (I, se 1 ), where I is the identity matrix, leads to Λ = K and Γ = W − e 1 .One can see that Λ measures the curvature (bending and torsion) and Γ measures the difference between d 1 and x ′ (elongation and shear).
Considering a hyperelastic material model with moderate strains, the Lagrangian density of the system can be written as where ρ > 0 is the linear density of the beam, U ext : SE(3) → R is an external potential function and J = ρ diag([J 1 , J 2 , J 3 ]) is the matrix of moments of inertia of the sections in the body frame.Assuming uniform cross sections and directors d 2 and d 3 coincident with the principal moments of area I 2 and I 3 , one gets that J 1 = ρ (I 2 +I 3 ), J 2 = ρI 2 and J 3 = ρI 3 , and , which are the matrices representing the corresponding stiffness parameters of the sections.κ 2 and κ 3 are possible shear correction factors.

Unit dual quaternion formulation
Working on SE(3) is difficult.In [22] the authors propose the use of a constrained approach where the space of dual quaternions, H, which is a vector space, is considered as ambient manifold and the unit dual quaternions, H 1 , as constraint submanifold since it is well-known that this latter space provides a double covering of SE(3).
The space of dual quaternions is defined by where ϵ is the so-called dual unit and is the space of quaternions.Both of these are vector spaces, so working with them is quite simple.
Similar to complex numbers, a conjugation operation is defined on the space of quaternions, namely, if p = p 0 + p 1 i + p 2 j + p 3 k, then p = p 0 − q 1 i − q 2 j − q 3 k, and this operation is inherited by dual quaternions.This defines a norm on H, ∥p∥ = √ pp and lets us write the inverse of p as p −1 = p/∥p∥ 2 .This also defines a seminorm on , where in the last equality we consider the quaternions q r , q ϵ as vectors in R 4 .The set of unit quaternions and unit dual quaternions are thus, H 1 := {q ∈ H | ∥q∥ = 1} and H 1 := q ∈ H | ∥q∥ = 1 respectively.More explicitly, the latter implies q 2 0,r + q 2 1,r + q 2 2,r + q 2 3,r = 1 (31a) q 0,r q 0,ϵ + q 1,r q 1,ϵ + q 2,r q 2,ϵ + q 3,r q 3,ϵ = 0 (31b) Figure 7: Spacetime grid for the multisymplectic variational integrator with relative indices marked in red.
As stated before, an element q ∈ H 1 can be put into correspondence with an element of SE(3).In particular, we can parametrize q by a rotation angle θ and two purely imaginary quaternions n, x, i.e. n 0 = x 0 = 0, with ∥n∥ = 1, representing a rotation axis and a three dimensional translation respectively.This way q = cos(θ/2) + n sin(θ/2) and q = q + 1 2 xqϵ.If q(t, s) ∈ H 1 , then One can thus define an ambient Lagrangian in the dual quaternions, L q, q, q′ = 2 M q q, q q − 2 C q q′ − qref q′ ref , where ), and α i ∈ R.These α can be chosen arbitrarily as they play no role in the dynamics once the unity constraints (31) are enforced.

Discrete Lagrangian
In order to discretize the beam, the spacetime [0, T ] × [0, ℓ] is discretized into a regular grid (see Figure 7) with constant space and time steps, ∆s and ∆t respectively.
We discretize the ambient Lagrangian density (32) applying the trapezoidal rule in both space and time a+1 ) to simplify the formulas.As derived in [22], the discrete constrained Euler-Lagrange field equations are derived via a discrete variational principle in space and time and subsequent rearrangement of terms in space index a and time index n.As shown there, a natural choice of null space matrix is The forced version of these equations results from the application of the discrete Lagrange-d'Alembert principle, similar to (5), a+1 , u n a ) denoting all external and control forces, and i = 1, ..., 4 coinciding with the corresponding relative node on which they are applied, as in Figure 7.This leads to DEL with a force contribution from each adjacent spacetime rectangle sharing the node under consideration: Suitable boundary conditions in space and time as well at the spacetime corners are directly derived via the discrete variational principle.Kelvin-Voigt type viscous damping is included as external forces that are proportional to the discrete approximation of the strain rate [23] with bulk viscosity ζ and shear viscosity η.In the moderate strain regime these result in a damping matrix D = diag([0, η(I 2 + I 3 ), χI 2 , χI 3 , 0, χA, ηA, ηA]), where χ the extensional viscosity.The corresponding discrete force is where by L T q , we denote the transposed of the dual quaternion left multiplication operation by q, and K n a = K(q n a , (q ′ ) n a ), with (q ′ ) n a = (q ′ ) n a+1 = (q n a+1 − qn a )/∆s and (q ′ ) n+1 a = (q ′ ) n+1 a+1 = (q n+1 a+1 − qn+1 a )/∆s.Figure 8 shows the position of the tip of a cantilever beam with fixed-free boundary conditions that is initially straight under gravity.The strain-rate proportional damping leads to reduced high frequency oscillations.

The Discrete Adjoint Method in Spacetime
The discrete adjoint method for the geometrically exact beam considers the configuration variables as well as the adjoint variables in space and time to derive the discrete adjoint equations in space and time.Single shooting in time while simultaneously solving the equations in space is used for the solution of the optimal control problem.The Barzilei-Borwein gradient method [24,25] is used for the update.Here, a pendulum-like beam subject to gravity and fixed-free translation and free-free rotation boundary conditions is considered with a torque u applied at the fixed end as discrete redundant control forces Since our control is only applied at the boundary, this is a boundary control problem for a PDE.The desired configuration is the upright rotated position of the beam, specified for each node in space.The final position considered is undeformed.The desired maneuver is from the lower position to the upright position in such a way that the inertial terms cancel the strains in the end configuration.As the system is heavily underactuated, the chosen input does not allow us to control the motion in the axial direction and does not lead to a stationary upright position.Hence, no end momentum is imposed.Nonetheless, the control task should demonstrate the presented method in an academic example that resembles the previous pendulum examples sufficiently.
Our optimal control problem is of the form min qd ,u d subject to: for a = 1, ..., A − 1 for n = 1, ..., N − 1 where (q 0 a ) * , (u 0 0 ) * , (p 0 a ) * , are given initial discrete values and (q N a ) * denotes the discretised desired end configuration.(p 0 a ) * are discrete initial temporal momenta, canonically associated to discretised linear and angular velocities, which in our example are set to 0.
The adjoint equations are obtained similar to the constrained temporal case by applying discrete variational calculus and nodal reparametrization, but now in space and time.However, the resulting equations are quite long, and so, will not be reproduced here in their entirety.For instance, the equations obtained by taking variations of the inputs at the fixed boundary a = 0, are 0 These are used to update the torque.If instead of boundary control we had controls over the bulk, these equations would generalise to all nodes as:

Fairly rigid Beam
The fairly rigid beam demonstrates the sequential optimization of the beam dynamics with objective minimization of the control effort.The simulation of the beam dynamics uses A = 10 nodes in space and N = 3000 nodes in time.The beam has a length of L = 1.The simulation duration is T = 1.The resulting time step is h =1 3000 in time and the step size in space is ∆s = 1  10 .A constant initial guess of u 0 = 1500 is used.The beam has a square cross-section of A cross = 0.01 with a side length of l s = 0.1.The chosen weighting for the end term is S q = 10 8 , and R = 10 −2 for the input 1 .The material of the beam is fairly rigid with a Young's modulus of E = 210000 and a Poisson ratio of ν = 0.3.The mass density is ρ = 7.85.The beam is damped with η = 1 • 10 −1 and ζ = 1 • 10 −2 .Figure 9a shows snapshot of the motion of the beam.Figure 9b shows the total energy H as well as all its contributions over time.The deformation energy is the difference between the total potential energy of the system U and the gravitational potential energy U grav .The main contribution to the kinetic energy T is due to translation.At the end of the simulation, the kinetic energy reduces due to the input weight.The optimized input is depicted in Figure 10a, it decreases to zero at the end of the simulation time.The optimized quantities, the distance of the beam to the desired end configuration as well as the control effort are depicted in Figures 10b and 10c, respectively.The gradient depicted in Figure 10d shows heavy oscillations.

Very flexible Beam
A very flexible beam demonstrates the sequential optimization for more flexible beams that show larger deformations and are therefore harder to control.The simulation of the beam and adjoint dynamics uses A = 5 nodes in space for a length of L = 1.This results      The weighting for the end configuration is S q = 10 2 .For this numerical experiment, the input weight was set to R = 0, since the chosen end configuration gets increasingly harder to reach for more flexible beams.
The optimization results are depicted in Figure 11.The input in Figure 12a is increased compared to the initial guess.In addition, oscillations are present.The gradient depicted in Figure 12b shows oscillations with high frequency that are likely caused by the dynamics of the beam in normal direction as these deformations are of much higher frequency than bending deformations due to the difference in stiffness.The objective is depicted in 12c.The largest decrease happens at the start of the optimization.Figure 12d depicts the total energy and its parts.During the optimization, mainly the translational part of the kinetic energy increases as well as the potential energy due to the gravitation.

Summary
The discrete adjoint method for variational integration of (constrained) ODEs is derived, and its convergence properties are demonstrated with the help of numerical examples.Quadratic convergence results of the configuration variables as well as for the adjoint variables based on simulations of a mathematical pendulum are observed.The discrete adjoint method is also applied to the multisymplectic Galerkin Lie group integrator for geometrically exact beam dynamics, in particular to the optimal control of the upward    motion of a pendulum-like beam.The discrete adjoint method directly derives fitting equations at the boundary based on the discretization chosen for the variational integrator.The discrete adjoint method for constrained systems with null space projection and nodal reparametrization also directly results in the null space projection of the discrete adjoint equations.The properties of the discrete adjoint method applied to structure preserving integrators have to be analyzed further as to understand the connection in a more general setting.

b
Error of adjoint variables λ d versus time step for the mathematical pendulum in minimal coordinates.

Figure 2 :
Figure 2: Error of the configuration q d and adjoint variable λ d .

Figure 3 :
Figure 3: Optimization results for the pendulum, using the discrete adjoint method and single shooting.

Figure 5 :
Figure 5: Optimization results for the pendulum in the constraint setting, using the discrete adjoint method and single shooting.

Figure 6 :
Figure 6: Configuration of a geometrically exact beam.

b
Energy of the beam for the optimized motion over time t.

Figure 9 :a
Figure 9: Motion of a fairly rigid beam and its energy.

d
Gradient of the augmented objective with respect to the input ∂ J ∂un over time t.

Figure 10 :
Figure 10: Optimization results of beam dynamics using single shooting for a fairly rigid beam.

ab
Initial configuration of the beam by example of a very flexible beam.Deformation of a very flexible beam during its motion.

Figure 11 :
Figure 11: Snapshot of a very flexible beam at n = 0 as well as during the upward movement of a very flexible beam.

a
Input un over time t.

b
Gradient of the augmented objective with respect to the input ∂ J ∂un over time t.
beam for the optimized motion over time t.

Figure 12 :
Figure 12: Optimization results of beam dynamics of a very flexible beam.