On the use of adjoint gradients for time-optimal control problems regarding a discrete control parameterization

In this paper, we discuss time-optimal control problems for dynamic systems. Such problems usually arise in robotics when a manipulation should be carried out in minimal operation time. In particular, for time-optimal control problems with a high number of control parameters, the adjoint method is probably the most efficient way to calculate the gradients of an optimization problem concerning computational efficiency. In this paper, we present an adjoint gradient approach for solving time-optimal control problems with a special focus on a discrete control parameterization. On the one hand, we provide an efficient approach for computing the direction of the steepest descent of a cost functional in which the costs and the error in the final constraints reduce within one combined iteration. On the other hand, we investigate this approach to provide an exact gradient for other optimization strategies and to evaluate necessary optimality conditions regarding the Hamiltonian function. Two examples of the time-optimal trajectory planning of a robot demonstrate an easy access to the adjoint gradients and their interpretation in the context of the optimality conditions of optimal control solutions, e.g., as computed by a direct optimization method.


Introduction
Improving the performance of mechanical systems requires sophisticated optimization strategies to fulfill the high demands of current and future product requirements.In general, two problem formulations can be considered to describe various optimization applications: structural optimization of mechanical components and/or finding an optimal control for nonlinear dynamical systems [37].The focus of this paper is on the latter problem and relates particularly with time-optimal controls.Bobrow et al. [4] solved the time-optimal control problem for the case where the path is specified and the actuator torque limitations for the robot control are known.They computed the optimal control torques using a conventional linear feedback control system.Shin and McKay [35] used dynamic programming considering parametric functions to reduce the state space.Regarding the application to industrial robots, smooth trajectory planning is essential and has been presented, e.g., by Constantinescu and Craft [8] by using cubic splines to parameterize the state-space trajectory.Reiter et al. [31] proposed a time-optimal path tracking problem formulated as a nonlinear programming (NLP) problem solved by a multiple shooting method to account for the continuity required to respect the technological limits of real robots.Moreover, methods for the fast computation of optimal solutions to planning problems with changing parameters based on B-spline parameterization are presented in [30].
An alternative to the methods mentioned above is the use of indirect optimization methods avoiding the solution of a boundary value problem suffering from poor initial controls.The fundamental work by Bryson and Ho [6] shows how the gradient in an indirect optimization approach can be computed in a straightforward manner using adjoint variables.Optimal control problems can be solved iteratively with adjoint gradients using nonlinear optimization routines, as described in the sense of optimal control or parameter identification in multibody systems, e.g., in [23].
The adjoint method has been used by various authors [2,14,17] in different research areas.In the last few decades, the sensitivity analysis based on the adjoint method has become increasingly important [7,26].An extensive literature review in a more recent work [15] presents various gradient-based optimization methods, especially in design optimization of flexible multibody systems.
Moreover, since the adjoint method is computationally efficient, real-time applications and neural network applications can be addressed with the adjoint approach.For instance, physics-informed neural networks use partial differential equations in the cost functions to incorporate prior scientific knowledge.Previous research has shown that the discrete adjoint approach is efficient in application of neural networks [29].Solvers capable of building efficient gradients are beneficial for training machine learning embedded cost functionals.Moreover, Gholami et al. [12] proposed an adjoint-based neural ordinary differential equation framework that provides unconditionally accurate gradients.Johnston and Patel [18] stated that adjoint methods are used in both control theory and machine learning to efficiently compute gradients of functionals.
In this paper, we concentrate on the efficient computation of gradients in optimal control problems and the role of adjoint variables in the optimization strategy.The proposed ideas can also be used in neural network approaches for the interpretation and evaluation of optimal solutions.In particular, we discuss a particular class of time-optimal control problems for dynamic systems involving final constraints.For such problems, Eichmeir et al. [10] recently presented an indirect gradient method to relate the control variations to the error of the given final constraints.In this paper, we extend the method to provide an exact gradient for discrete control parameterizations for direct or indirect optimization methods.For both methods, the Hamiltonian of the system can be determined and can then be considered for classical statements about optimality.To be more precise, we investigate the role of the adjoint variables in the verification of the optimality conditions of a solution derived by an arbitrary optimization approach, e.g., computed by a direct optimization method.
To this end, we discuss the time-optimal control problem of a two-arm robot.In a first example, the robot is formulated with rigid bodies.The advantage of the proposed approach, in particular, concerning computational effort, will be exploited in a second example of a flexible robotic system using the absolute nodal coordinate formulation (ANCF) for describing large deformations.Both examples are solved by a direct optimization method, and the evaluation of the optimality criteria regarding a Hamiltonian leads to an interpretation of the adjoint gradients.In addition, we performed a comparison of the number of function evaluations in a local minimum considering either numerical or analytical gradients, which shows the clear advantage of using adjoint gradients.

Time-optimal control problem
The aim of an optimal control problem is to find a control of a dynamical system to minimize certain performance measures.Let us consider the nonlinear dynamical system where x(t) ∈ R n and u(t) ∈ R m are the vectors of state and control variables, respectively.A performance measure might be the energy consumption or the operation time from initial to final state.The focus in this work will be on the latter.We briefly discuss the time-optimal control problem [3,6,21].The goal is to determine a final time t f = t * f and a control u(t) = u * such that the scalar cost functional is minimized while satisfying the final constraints Note that the state variables x(t) and control u(t) are functions over the time interval t ∈ [t 0 , t f ] in the cost functional and that inequality constraints for the state variables and control are accounted for using the penalty function P (x, u).This means that a penalty term will be added to the final time t f in case of violating inequality constraints; see Eq. ( 2).The optimal control problem is defined by Eqs. ( 1)-( 3) and can be solved by two different approaches.Direct methods transform the original infinite-dimensional optimization problem into a finite-dimensional one.The resulting NLP problem can be treated by well-known methods, e.g., the sequential quadratic programming (SQP) approach [24].An optimal point is in accordance with the Karush-Kuhn-Tucker (KKT) conditions [19,22], which are necessary optimality conditions for direct optimization methods.Contrary, indirect methods are based on the calculus of variations and the derived necessary optimality conditions.These explicit expressions result from Pontryagin's minimum principle and are in the form of a two-point boundary value problem, which is usually hard to solve.Shooting methods, collocation methods, or gradient-based approaches can be used to solve the underlying two-point boundary value problem.

Direct optimization method
Direct methods transform the original infinite-dimensional optimization problem into a finite-dimensional one by a parameterization of the control u(t).In this paper, the control is discretized by a set of grid nodes ū.Hence the set of optimization variables in time-optimal control problems is defined by z T = t f , ūT ∈ R z .The discretization of the control leads to an NLP problem of the form min z J (z) Note that the NLP problem is defined by the optimization variables z.To evaluate the cost functional and final constraints, the state variables x have to be computed with respect to the optimization variables z.For solving the state equations in this step, a classical ordinary differential equation (ODE) solver can be used, e.g., an explicit Runge-Kutta solver.The NLP problem in Eq. ( 4) can be solved by classical direct optimization methods.The local optimality of direct methods is investigated by introducing the Lagrangian function where ξ ∈ R q is the Lagrange multiplier.The necessary first-order or KKT conditions of the optimization problem in Eq. ( 4) are formulated as One of the most powerful methods for finding a KKT point z * , ξ * of the NLP problem is an SQP approach [16,25,28].The basic idea of this approach is to replace the original NLP problem with a quadratic subproblem.The solution of this subproblem is then used in an iterative method to determine a KKT point satisfying Eq. ( 6).The quadratic approximation of the cost function and the linear approximation of the constraints lead to min where d k = z − z k is the minimizer of the quadratic subproblem.In analogy to Eq. ( 6), the KKT conditions regarding the quadratic-linear model result in the linear system which is called the KKT system.The solution of this system is used to update the optimization variables for the (k + 1)th iteration.The method is made more robust by using a step length control parameter α k .The factor is obtained by minimizing a proper merit function.For a detailed description and different variants of the basic SQP approach, we refer to [3,24], which is quite standard and implemented in various codes in this way.One goal of the present paper is to derive adjoint gradients providing analytically given gradients for an SQP approach.
Hence the following section is devoted to the derivation of analytically computed adjoint gradients of the Jacobians in Eq. ( 8).This approach is especially efficient in the case of a high number of optimization variables.Moreover, the indirect approach is used for the evaluation of the chosen parameterization of the control.The parameterization of the control is correctly chosen with respect to the problem formulation if the necessary optimality conditions based on indirect methods are sufficiently small; see Fig. 2 for a graphical validation in the scope of an example.

Adjoint gradient approach
In this section, we exploit the adjoint gradient approach to evaluate the necessary first-order conditions regarding the Hamiltonian function in an indirect optimization method.For this purpose, we briefly summarize the key idea and theory for determining time-optimal controls for dynamic systems regarding final constraints using an indirect method.Generally, indirect methods are based on optimality conditions and lead to solving the underlying twopoint boundary value problem.Instead of solving the two-point boundary value problem, the optimization problem can also be addressed by a gradient-based method, e.g., the Kelley-Bryson method [5,20].The key idea is to find a variation of the control to produce the maximum local decrease of the cost functional.To find a minimum of the cost functional, we can simply walk a short distance along the negative gradient of the cost functional.We pursue this method to fulfill the necessary optimality conditions derived by the calculus of variations following the basic ideas in [3,6,21].

Combined adjoint gradient approach regarding final constraints
The adjoint gradient computation in the presence of final constraints has been presented in a recent work [10], and here we briefly summarize the main steps.To derive the adjoint and influence equations, the cost functional in Eq. ( 2) is extended by the state equations in Eq. ( 1) leading to for any choice of the adjoint variables p ∈ R n in the cases where the state equations are satisfied.To compute the variation of the extended cost functional J , we perform infinitesimal variations of the final time δt f and of the control δu, which also cause a variation of the states δx due to the state equations.The resulting variation of J leads to a variation δ ẋ, which can be eliminated by integration by parts.Finally, the cost functional can be reduced to (11) in the case where the adjoint variables fulfill the linear time-variant final value problem which is solved backward in time.Note that to compute the adjoint system, the state equations must first be solved forward in time.The subscripts u and x denote the partial derivatives with respect to u and x, and the subscript f indicates the evaluation of quantities at time t = t f .The optimization problem in Eqs. ( 1)-( 3) does not only consist of minimizing the cost functional, but in addition, the final constraints must be fulfilled.The variations of the cost functional and final constraints will lead to a combined gradient-based approach for updating the final time and controls.As a first step, the final constraints are extended with the state equations, leading to Proceeding exactly the same way as above, the variation for the extended final constraints is defined by where φf denotes the total time derivative of Eq. ( 3) evaluated at time t = t f .The influence adjoint variables R ∈ R n×q have to fulfill the matrix differential equation in which one set of n ordinary differential equations for each component of the final constraints φ(x(t f ), t f ) = 0 is solved backward in time.
A linear combination of the scalar δ J and vector δ φ needs the introduction of vector multipliers ν ∈ R q resulting in where ν is a vector of multipliers to combine both sets of adjoint variables and is computed in such a way that the variations in the control and final time lead to a better approximation of the final constraints.The largest possible decrease of the combined variation is obtained if the variations δu and δt f are chosen in the appropriate descent directions in Eq. ( 16) for the optimal updates u new = u + δu and t f, new = t f + δt f .These updates always reduce the cost functional and final constraints within one iteration.Moreover, the Hamiltonian for the time-optimal control problem in Eqs. ( 1)-( 3) was formulated by in which λ(t) = p(t) + R(t)ν exploits the decoupling of boundary conditions of the state and adjoint equations by introducing a set of adjoint variables p and the so-called influence adjoint variables R for q final constraints.The decoupling within the multiplier λ enables sequential integration of a new set of canonical equations forward and backward in time, depending on a putative optimal control history.The solution of the canonical (adjoint and influence) equations for p and R can be combined to determine the Hamiltonian in Eq. (17).
A more elaborate derivation can be found in [10], supplemented with examples and convergence analysis.

Parameterization of control
The following adjoint gradient approach allows the consideration of final constraints and enables the use of various control parameterizations.Hence the infinite-dimensional optimal control problem is reduced to a finite-dimensional problem to reduce the complexity of the optimization task.As proposed for instance in [10], the variation δu can be defined as an explicit function in time by and only depends on the time discretization of the forward and backward solution.It has to be emphasized that this update δu is not yet dependent on the type of parameterization.To meet the restrictions of industrial applications, feasible control parameterizations have to be specified.However, the control function u(t) can be parameterized by grid nodes ū (see the Appendix for a cubic spline parameterization of the control) in the form leading to the variation of the control It has to be emphasized that any parameterization according to Eq. ( 19) can be used for the proposed gradient approach.Following the theory introduced in Section 2.2.1, the variation in Eq. ( 20) is inserted into Eq.( 16).Since the variation δ ū does not depend on time, we can rewrite the combined variation in the following form: Similarly to the previous section, the variations of the grid nodes δ ū and final time δt f are defined by so that the combined variation leads to the largest possible decrease.The scalar κ is a sufficiently small step size, and the factor α can be used as a scaling factor if the magnitudes of the two latter variations differ dramatically.The choice of the vector of multipliers ν is based on the reduction of the final constraints in every iteration with the updates δ ū and δt f , i.e., choosing an appropriate update parameter ε > 0, as shown in detail in [10].This approach finally results in where we use the abbreviations This combined gradient approach enables the use of various parameterizations of the control, shown, e.g., in [10] for a bang-bang control.This paper investigates the combined gradient approach to evaluate necessary optimality conditions regarding the Hamiltonian function.
In case of using a direct optimization method, the corresponding analytically adjoint gradients can be used instead of numerically computed gradients because of the computational burden.To derive these adjoint gradients, the discrete control parameterization from Eq. ( 20) is inserted into the variation of the cost functional in Eq. (11) and into the variation of the final constraints in Eq. ( 14).Hence the analytically adjoint gradients are given by and can be used instead of the numerical gradients in Eq. ( 8) in a direct optimization method.Note that the analytical gradients of J and φ with respect to t f cannot be simply given in a proper form and are therefore computed numerically in this case.It has to be emphasized that the size of the adjoint system does not increase with a larger number of grid nodes, which is not the case in the direct differentiation method or a numerical differentiation method.If (forward or backward) numerical differentiation is used, then the equations of motion have to be solved (1 + z) times to evaluate the numerical gradients, where z is the number of optimization variables.Hence, the number of forward simulations grows linearly with the number of optimization variables.Note that the variables κ, α, and ε are introduced here in accordance to the prework paper [10] but are not relevant in the discussion of the present paper, and therefore the variables are set to κ = α = ε = 1.In this paper, the gradients in Eq. ( 28) and ( 29) are provided to a direct optimization method without using any user-defined scaling factors.Furthermore, the presented approach provides a new view of adjoint gradients in terms of the evaluation of optimality criteria using the Hamiltonian function.

Interpretation of the adjoint variables for optimality conditions
The decoupling of the gradients in Section 2.2 can lead to a new perspective on the adjoint variables in the evaluation and interpretation of Pontryagin's principle.Bryson and Ho [6] derive the necessary optimality conditions using the calculus of variations.The minimization of the Hamiltonian must be addressed for restricted optimal controls problems according to Pontryagin's minimal principle [21].In this paper, we state the Hamiltonian as H(x, u, λ) := 1 + P (x, u) + λ T f(x, u), (30) in which the multiplier λ is computed by a linear combination of the adjoint variables Using the introduced Hamiltonian, we can formulate the necessary optimality conditions for time-optimal control problems.The derived variation in Eq. ( 22) can be rewritten in terms of the Hamiltonian as where the partial derivative of the Hamiltonian with respect to the grid nodes ū is given by In a similar manner, the variation in Eq. ( 23) can be reformulated as where the total time derivative of the final constraints is given by Note that the partial derivative of the constraints with respect to the states is expressed in terms of the influence adjoint variables given in the final condition in Eq. (15).In this paper, we use the adjoint gradients within the optimality conditions for other optimization methods relating to a Hamiltonian function.An optimal control parameterized by u = C ū must satisfy the necessary optimality conditions fulfilling the boundary conditions x(t f ) = x f , (41) However, since the defined Hamiltonian is an autonomous system, d dt H (x * , u * , λ * ) = 0 only if an infinite number of grid nodes is used.In addition to the optimality conditions in Eq. ( 36)-(39), the optimal control history u * can be evaluated in the particular case where the control appears linearly in the underlying state equation by introducing the switching function [27] Following Pontryagin's principle, three cases can be observed to express the optimal control where an infinite number of grid nodes ûi is assumed, and the dynamic behavior of the control is of the bang-bang type.Hence Eq. ( 43) and ( 44) can be used in a postprocessing step to relate the optimal control u * i generated by a direct method with the switching functions h i to evaluate the solution in terms of an indirect method.If an infinite number of grid nodes is used, then the roots of the switching function exactly match the switching times of the control.Note that the optimization results of the direct approach may yield to so-called singular intervals, where the switching function is zero for a finite time interval, and Pontryagin's principle does not provide any information on the optimal control.However, gradient-based optimization methods are appropriate to identify the control history of singular intervals [9].
5. Compute the adjoint gradients of the cost functional and the final constraints 6. Use these analytical gradients for solving the constrained optimization problem by a direct method, e.g., as shown in Section 2.1 by an SQP approach.Repeat steps (2)-( 5) until the KKT conditions are satisfied and hence until an optimal solution z * T = t * f , ū * T is found.7. The states x * according to the optimal control grid nodes ū * have to be computed for evaluation of the Hamiltonian in Eq. (30).Moreover, the corresponding λ * is obtained based on Eq. ( 31). 8. Finally, the optimality conditions in Eqs. ( 36)-(39) can be evaluated in terms of the Hamiltonian function.

Numerical examples
To show the use of the adjoint gradient approach in a direct optimization method and to evaluate the optimality conditions in a typical time-optimal control problem, we present two examples of a Selective Compliance Assembly Robot Arm (SCARA) in a rest-to-rest motion.The serial robot consists of two bodies connected with revolute joints and an additional mass is attached to the tool center point (TCP).Firstly, we only consider structural components that are modeled as rigid bodies.Secondly, the rigid bodies are replaced with flexible components.For both examples, the goal is to identify controls u * 1 and u * 2 (m = 2) with a continuity requirement up to C 2 such that the TCP moves from a prescribed initial state to a final state in minimal time.Minimizing the cost functional leads to the shortest operation period t * f with respect to physical bounds of the controls, i.e., −u i,max ≤ u i ≤ u i,max .The physical bounds are considered with penalty approach P (u 1 , u 2 ) := μ 1 P 1 + μ 2 P 2 , where the penalty function is given by To ensure a continuity requirement up to C 2 of the control u and to transform the original infinite optimization problem to a finite dimensional one, the interpolation scheme in Eq. (81) in the Appendix is used to parameterize the control.Hence the vector of optimization variables z consists of the final time t f and discrete grid nodes ûi , i.e., z T = t f , ûT 1 , ûT 2 .The NLP problem is formulated as direct single shooting and solved by a standard SQP method implemented in the MATLAB function fmincon, in which the gradients of the Lagrange function in Eq. ( 28) and ( 29) are computed with the proposed adjoint method.In addition, the Hessian of the Lagrange function is computed by a BFGS method.

Task description and optimization problem
In the following example, the robot depicted in Fig. 1 is considered in the time-optimal control problem.All structural components are modeled as rigid bodies.The robot is described with a minimal set of generalized coordinates ϕ 1 and ϕ 2 .The system dynamics are obtained with a coupled first-order differential equation by introducing the state variables where φi = ω i .This model has been studied by several authors for time-optimal control problems, e.g., in [10].The mass of the first body and the mass of the TCP is given by m 1 = m 3 = 1 kg, the mass of the second body is m 2 = 0.5 kg, and the length of both links is l 1 = l 2 = 1 m.The moment of inertia of both bodies around their centers of gravity is defined as J i = m i l 2 i /12.The cost functional of the optimization problem is given in Eq. ( 49), and the final constraints read where x f = 1 m and y f = 1 m denote the desired final configuration of the TCP in the workspace W. Physical limitations of the controls are considered with the upper bounds u 1,max = 4 Nm and u 2,max = 2 Nm.Moreover, the weights for the penalty approach are chosen as μ 1 = μ 2 = 10.The initial conditions of the states are defined by x 0 = (−π/4, 0, 0, 0) T .The control u i is discretized with k = 50 grid nodes and uniform intervals in the normalized time domain τ , i.e., the number of optimization variables is z = 101.As an initial guess, the final time is taken as t f = 3 s, and the grid nodes are set to zero, ū = 0.
Fig. 2 Initial controls, optimal controls, and switching functions for a time-optimal rest-to-rest motion of the rigid SCARA model

Results
Figure 2 shows the time-optimal control history for both controls in the normalized time domain, where the optimized final time is given by t * f = 1.8294 s.The results are in accordance with the defined final constraints in Eq. ( 53), and the controls converge to bang-bang solutions with respect to the control parameterization.Note that the switching function h i in Eq. ( 43) is derived from an indirect method and evaluated in terms of the optimal SQP solution.However, the resulting switching function agrees well with the defined control in Eq. ( 44), and the Hamiltonian of the system is sufficiently small.The optimization result is robust with respect to initial guesses, which implies that the proposed approach converges even if the initial guess is far away from the optimal solution.
Table 1 compares the number of function evaluations when numerical or adjoint gradients are used in the direct optimization method for converged solutions z * .In the table, the controls are discretized with k = 5 and k = 50 grid nodes each.We can see that the number of function evaluations in the case of numerical gradients is in general higher when compared to the case where analytical gradients are used.This becomes even more obvious when the number of optimization variables is increased.In addition, the computational cost is reduced for a large number of grid nodes by using adjoint gradients, since the adjoint system does not depend on the number of optimization variables.Note that the number of iterations does not depend on the way the gradients are computed, i.e., both gradients point in the same direction.To be more precise, the number of function evaluations counts the number of evaluations of the cost functional and the number of evaluations of the constraints.
In the case of using finite differences, note that for each evaluation of the constraints, the state equations must be solved.Hence we observe (in general) a high computational effort.In comparison, the function evaluations for analytically derived adjoint gradients are much fewer.Here, although the calculation of the gradient is complicated, the state equations have to solved only once, and just one system of ordinary differential equations (the adjoint system) needs to be solved, which is independent of the number of grid nodes k.The numbers in the table are of course problem-dependent and also dependent on the solver settings, thus also on the number of iterations.

Flexible SCARA
Modern robot design includes innovative lightweight techniques to reduce mass and energy consumptions in production lines.Therefore optimal control problems have to be defined for flexible multibody systems, in which the flexible components have to be able to describe large deformations during motion.Multibody systems with flexible components are often underactuated systems, and the optimal control problem becomes more complicated in comparison to fully actuated systems [32].
In this paper, we use the ANCF in the second example examining the effects due to elasticity of the SCARA.The ANCF has been developed particularly for solving large deformation problems in multibody dynamics [34].Contrary to classical nonlinear finite element approaches used in the literature, the ANCF does not use rotational degrees of freedom and therefore does not necessarily suffer from singularities emerging from angular parameterizations.The most essential advantage of the ANCF is the fact that the mass matrix is constant with respect to the generalized coordinates.The following example is intended to show the applicability of the proposed method for solving time-optimal control problems of highly flexible multibody systems.Therefore we use a standard ANCF element, which is available and has been tested extensively in the literature; see, e.g., [1].

Equations of motion
Based on the proposed ANCF formulation of Berzeri and Shabana [1], a two-node element is described in the global coordinate system with the generalized coordinates q T = r (1) T , r (1)   χ where r (i) denotes the nodal position vector, and r (i) χ represents the nodal slope vector of the ith node.Figure 3 shows the used ANCF element in a deformed configuration including the generalized coordinates.
An arbitrary point on the undeformed configuration is expressed as χ ∈ [0, l], where l is the original length of the beam.The position vector of the beam model is defined as where the shape function matrix S maps the generalized coordinates into the global position vector in the reference frame of the workspace W.
The governing equations of a single beam element require the mass matrix and generalized forces.The mass matrix of an element is defined by using the kinetic energy Fig. 3 ANCF element in a deformed configuration and the normalized beam length in the undeformed configuration ξ = χ/l ∈ [0, 1].Note that the ANCF mass matrix is constant, which leads to numerical advantages.Additionally, the elastic forces vector Q k , external applied torques Q u , and damping forces Q d have to be defined.The elastic forces vector is defined as where the strain energy due to longitudinal and bending deformations is used.Here E represents the Young modulus, A is the cross-sectional area, and I is the second moment of area.The curvature κ is defined with the Serret-Frenet formulas [13] and the longitudinal strain ε is formulated with the nonlinear Green-Lagrange strain measure where In addition to elastic forces, the principal of virtual work due to a torque M i acting on the angle of rotation of the cross-section θ leads to the generalized force Q i .Hence generalized forces associated with control u and damping f d = −d θ in the revolute joint are given by where d is the viscous damping coefficient.Finally, the equations of motion for a single beam element can be obtained as Introducing the generalized velocities v = q as additional variables transforms the secondorder differential equation for q into a first-order system Instead of using the augmented formulation of Eq. ( 65) to obtain the equations of motion for N connected elements, it is possible to define an independent set of generalized coordinates to obtain the equations of motion for a constrained multibody system in the form of Eq. (66).Remark that the system Jacobians are calculated with a symbolic toolbox, simplified and factorized to reduce the complex expressions for efficient use.Note that in the two-arm SCARA example, the revolute joint between the first arm and the ground and the revolute joint between the two arms reduce the number of generalized coordinates.The following set of parameters is used in the optimization procedure: the mass of beams m 1 = m 2 = 2 kg, the beam length in undeformed configuration l 1 = l 2 = 1 m, the viscous damping coefficient d 1 = d 2 = 0.1 Nm/rad, the axial stiffness E 1 A 1 = E 2 A 2 = 300 N, and the bending stiffness Moreover, an additional mass attached to the TCP m 3 = 0.5 kg is considered, which has to be taken into account in Eq. ( 56) for the kinetic energy of the second beam.

Optimization problem
Similarly to the example in Section 4.1, the cost functional is given in Eq. ( 49), and the final constraints for the TCP read where x f = x f , y f T with x f = 1 m and y f = 1 m denotes the desired final configuration of the TCP in the workspace W. The state variables of the remaining nodes are not specified and therefore free.Physical limitations of the controls are considered with the upper bounds u 1,max = u 2,max = 1 Nm.The weights for the penalty approach are chosen as μ 1 = μ 2 = 50.The initial state of the flexible SCARA is defined in the undeformed configuration as in the rigid example with ϕ 1 = −π/4 rad and ϕ 2 = 0 rad.In this example, the control u i is discretized with k = 10 grid nodes and uniform intervals in the normalized time domain τ , i.e., the number of optimization variables is z = 21.As an initial guess, the assumption for the final time is t f = 5 s, and the grid nodes are set to zero, ū = 0.

Results
Figure 4 shows the time-optimal control history for both controls in the normalized time domain τ , where the optimized final time is given by t * f = 3.7289 s.The results are in accordance with the defined final constraints in Eq. (67), and the optimality conditions in Eqs. ( 36)-(39) are sufficiently small.Snapshots of the time-optimal motion are illustrated in Fig. 5, where the nodal slope vector r (i)  χ is scaled for improved representation of the structural flexibility.

Conclusion and outlook
In this paper, we presented a gradient-based technique to show a new perspective on the optimality conditions in time-optimal control problems of dynamical systems considering final constraints.Conventional solutions based on a nonlinear programming approach/SQP approach utilize the adjoint variables to assess the optimality conditions regarding the Hamiltonian function.The use of adjoint gradients for a discrete control parameterization is presented in two examples.The gradients are used to replace numerical gradients in a direct optimization method and to evaluate the optimality conditions in terms of a Hamiltonian.The present work illustrates the computational advantage especially by application of the adjoint gradients for discrete control parameterizations in time-optimal control problems of flexible multibody systems including a high number of degrees of freedom.Moreover, the comparison of function evaluations of numerical and adjoint gradients can provide a future perspective for significantly reducing the computational burden when applied to highly dimensional complex multibody system applications.
In a future work, the proposed approach can be extended to formulate the multibody system by a set of redundant coordinates similarly to [23].Considering differential-algebraic equations requires consistent boundary conditions for the adjoint variables.A similar approach as proposed by Gear, Gupta, and Leimkuhler [11] can be used to overcome this issue.

Appendix: Formulation of cubic spline parameterization
The original infinite-dimensional optimization problem in Eqs. ( 1)-(3) has to be transformed into a finite-dimensional one to carry out a direct optimization method.This procedure is usually denoted as a direct transcription method [3].In general, the literature provides various formulations that can be pursued to perform such a transformation.All methods result in a vector z to describe the control history.One common method is to carry out a time discretization of the control and an interpolation between the resulting subintervals; e.g., Steiner and Reichl [36] used a linear dependency between the subintervals to minimize a cost functional.A higher-order interpolation scheme is obtained using cubic splines, e.g., in [33].In the present work, we use a cubic interpolation scheme of the control history u(t): where s i is the ith cubic spline segment for t ∈ [t i , t i+1 ], and s ∈ N represents the number of piecewise defined spline functions.A given grid node is expressed with ûi , and {b i , c i , d i } are constant spline parameters associated with the ith segment s i .To determine a spline with C 2 continuity, we require that This set of equations leads to a linear system with h i := t i+1 − t i : Since the number of linear equations x = 3s − 2 in Eq. ( 70) is lower than the number of unknowns y = 3s, the linear system is underdetermined.The first and second time derivatives of the splines s 0 (t 0 ) and s s−1 (t s ) are still undefined and can be used to determine a unique solution of the spline parameters p b,c,d .One option is to prescribe the velocity of the first and last spline segment s 0 (t 0 ) = s s−1 (t s ) = 0: (77) The Boolean vector δ ∈ R s picks a certain quantity corresponding to the time interval t ∈ [t i , t i+1 ], and the components are defined by δ i := 1 for t ∈ [t i , t i+1 ] with i = 0, 1, . . ., s − 1, 0 otherwise, (78) using the Kronecker delta, e.g., δ = (0, 1, 0, 0) for s = 4 spline segments in the second time interval.In this sense, the Boolean matrix P maps all 3s spline parameters into those active in the ith time interval t ∈ [t i , t i+1 ].Now using Eq. ( 74), the control history can be expressed as a simple vector multiplication Note that all the information of all spline segments is given in c.Instead of using a single control u, generalizations of Eq. ( 79) for m controls are readily given by where ūT = ûT The interpolation scheme in Eq. ( 81) is used in Section 2.2.2 to describe a continuous control history.The variation of grid nodes leads to i.e., an update of the grid nodes δ ū in the optimization procedure leads to an update of the control function u(t), shown in Fig. 6 for a single control.It must be emphasized that all m controls have to be discretized with the same number of grid nodes k and time intervals [t i , t i+1 ].The discrete control parameterization can be applied for optimal control problems in direct optimization methods, but also in the same manner in indirect optimization methods.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

3. 1
Procedure for the use of the adjoint gradients 1. Select an initial final time t f and initial grid nodes ū.In time-optimal control problems, it is numerically advantageous to use a normalized time domain τ = t/t f ∈ [0, 1].Derivatives with respect to the original time coordinate are given by d(•)/dt = 1/t f (•) , where the variable (•) is a function of the normalized time domain τ .2. Solve the state equations related to initial conditions x = t f f(x, u) with x(0) = x 0 (45) using an ODE solver.3. Compute the adjoint variables p(τ ) by solving the linear time-variant final value problem backward in time p = −t f P T x + f T x p with p(1) = 0. (46) 4. Compute the adjoint variables R(τ ) using the matrix differential equation

Fig. 1
Fig.1Planar two-arm robot with rigid bodies in a general configuration

Fig. 6
Fig.6 Effect of updating grid nodes ûi and ûi+1 on the continuous control function u(t)

1 ,
. . ., ûT m ∈ R m•kcollects all grid nodes k of the m control inputs, and the sparse block diagonal matrix C reads

Table 1
Comparison of two different approaches to provide gradients in the SQP routine for converged solutions