# Spatio-temporal stiffness optimization with switching dynamics

## Abstract

We address the optimal control problem of robotic systems with variable stiffness actuation (VSA) including switching dynamics and discontinuous state transitions. Our focus in this paper is to consider dynamic tasks that have multiple phases of movement, contacts and impacts with the environment with a requirement of exploiting passive dynamics of the system. By modelling such tasks as a hybrid dynamical system with time-based switching, we develop a systematic methodology to simultaneously optimize control commands, time-varying stiffness profiles and temporal aspect of the movement such as switching instances and total movement duration to exploit the benefits of VSA. Numerical evaluations on a brachiating robot driven with VSA and a hopping robot equipped with variable stiffness springs demonstrate the effectiveness of the proposed approach. Furthermore, hardware experiments on a two-link brachiating robot with VSA highlight the applicability of the proposed framework in a challenging task of brachiation.

## Keywords

Hybrid dynamics, Passive and intrinsic dynamics, Optimal control, Temporal optimization, Variable stiffness actuation

## 1 Introduction

Towards the aim of achieving highly dynamic and flexible movements in close interaction with the environment, a number of variable stiffness actuators (VSAs) have been recently developed (Van Ham et al. 2007; Catalano et al. 2010; Hurst et al. 2010; Eiberger et al. 2010; Jafari et al. 2010) (see (Van Ham et al. 2009) for reviews). VSAs are composed of mechanically adjustable compliant (passive) mechanisms with the capability of simultaneous modulation of stiffness and output torque. In contrast to conventional stiff actuators, VSAs are expected to have desirable properties such as intrinsic compliance, energy storage capability with potential applications in human-robot interaction and improvements of task performance in dynamic tasks.

Despite potential benefits of variable stiffness joints, finding an appropriate control strategy to fully exploit the capabilities of VSAs is challenging due to the increased complexity of mechanical properties and the number of control variables (redundancy in actuation). Taking an optimal control approach, recent studies have investigated the benefits of VSA such as energy storage in explosive movements from a viewpoint of performance improvement (Braun et al. 2012, 2013; Garabini et al. 2011; Haddadin et al. 2011). Such benefits of VSAs in a ball throwing task have been demonstrated by simultaneously optimizing time-varying torque and stiffness profiles of the actuator (Braun et al. 2012) with a focus on an optimal control formulation under actuation constraints for complex hardware mechanisms in VSAs (Braun et al. 2013). An optimal control problem of maximizing link velocity with VSA models has been investigated by Garabini et al. (2011) and Haddadin et al. (2011). It is shown that, with the help of appropriate stiffness adjustment during a hitting movement, a much larger link velocity can be achieved than that of the motor in the VSA. In a similar problem, Hondo and Mizuuchi (2011) have discussed the issue of determining the inertia parameter and spring constant in the design of series elastic actuators to increase peak velocity. In robot running, Karssen and Wisse (2011) have presented numerical studies to demonstrate that an optimized nonlinear leg stiffness profile could improve robustness against disturbances.

However, traditional approaches have focused on the optimal control formulation over a predetermined time horizon with smooth, continuous plant dynamics. When considering tasks that consist of multiple phases of movements including switching dynamics and discrete state transition (arising from interaction with the environment), an individual phase-by-phase optimization strategy could result in a suboptimal solution.

In this paper, we investigate spatio-temporal stiffness optimization in such problems in order to exploit the benefits of VSA. In addition to optimizing control commands and stiffness, we develop a systematic methodology to simultaneously optimize the temporal aspect of the movement (e.g., movement duration). We address optimization problems for tasks with multi-phase movements including switching dynamics, impacts and contacts with environments in tasks requiring exploitation of intrinsic dynamics in underactuated systems.

In order to demonstrate the effectiveness of the proposed approach, we present numerical evaluations of robot brachiation and hopping driven by VSA. In addition, we report hardware implementation of the proposed approach on a physical two-link brachiating robot with VSA to demonstrate its applicability in achieving highly dynamic movements under real-world conditions.

### 1.1 Spatio-temporal optimization of multi-phase movements

Dynamics with intermittent contacts and impacts such as locomotion and juggling are often modelled as hybrid dynamical systems which consist of (multiple sets of switching) continuous dynamics and discontinuous state transition determined by switching surfaces (state-based switching) (Bätz et al. 2010; Grizzle et al. 2001; Rosa et al. 2012; Long et al. 2011). From a control theoretic perspective, a significant effort has been made to address optimal control problems of various classes of hybrid systems (Branicky et al. 1998; Sussmann 1999; Xu and Antsaklis 2003, 2004). However, illustrative examples in the control literature are confined to low-dimensional and simple dynamical systems, and only a few robotic applications can be found for optimization of movements over multiple phases (Buss et al. 2002; Long et al. 2011). Instead of using hybrid dynamics modelling, different optimization approaches to dealing with multiple contact events have been proposed. Tassa et al. (2012) have proposed an iLQG (iterative linear quadratic Gaussian)-based (Li and Todorov 2007) model-based predictive control with smooth approximation of contact forces for online motion synthesis of a simulated humanoid model without the need of switching dynamics. A direct trajectory optimization method for rigid body systems subject to collisions by sequential quadratic programming has been proposed (Posa et al. 2014). In the study by Posa et al. (2014), contact forces are explicitly included as constraint forces with complementarity conditions and directly optimized together with trajectories and control commands.

Solving hybrid optimal control problems with state-based switching is non-trivial even if the number and the sequence of switching are known a priori. One of the reasons is that additional constraints need to be satisfied such that the states must lie on the switching surface at the instance of switching. Furthermore, it is necessary to find the time of the switching instance influenced by the control commands, whose analytical expression is difficult to obtain in general (Xu and Antsaklis 2004). These conditions are equivalent to having several interior-point constraints forming a multipoint boundary value problem, for which solutions are generally known to be hard to find (Bryson and Ho 1975).

Thus, we suggest an approximate approach to the hybrid optimal control problem, where the multiple phases of movement are modelled as time-based switching hybrid dynamics assuming that the sequence of switching is known. Necessary conditions for optimality in the case of time-based switching are simpler than those of state-based switching (Xu and Antsaklis 2004) and can be dealt with based on the optimization method we use in this paper.

1. use of nonlinear time-based switching dynamics with continuous control input to model the dynamics of multi-phase movements;
2. use of nonlinear discrete state transition to model contacts and impacts;
3. use of realistic plant dynamics with a VSA model;
4. introduction of a composite cost function to describe task objectives with multi-phase movements;
5. simultaneous optimization of joint torque and stiffness profiles across multiple phases;
6. optimization of switching instances and total movement duration.

*feedback* control law, while many trajectory optimization algorithms typically compute only optimal *feedforward* controls. Discussions on alternative optimal control approaches such as indirect methods and direct methods can be found in (Braun et al. 2013).

## 2 Problem formulation

We present a general formulation of optimal control problems for tasks with multiple phase movements including switching dynamics and discrete state transition arising from interactions with an environment.

### 2.1 Robot dynamics with variable stiffness actuation

*i* denotes the *i*th subsystem, \(\mathbf{q}\in \mathbb {R}^n\) is the joint angle vector, \(\mathbf{q}_m \in \mathbb {R}^m\) is the motor position vector of the VSA, \(\mathbf{M}_i \in \mathbb {R}^{n \times n}\) is the inertia matrix, \(\mathbf{C}_i \in \mathbb {R}^n\) is the Coriolis term, \(\mathbf{g}_i \in \mathbb {R}^n\) is the gravity vector, \(\mathbf{D}_i \in \mathbb {R}^{n \times n}\) is the viscous damping matrix, and \(\varvec{\tau }\in \mathbb {R}^n\) are the joint torques from the variable stiffness mechanism. In the equations above, (1) denotes the rigid body dynamics of the robot and (2) denotes the servo motor dynamics in the VSA. In (2), \(\varvec{\alpha }\) determines the bandwidth of the servo motors^{1} and \(\mathbf{u}\in \mathbb {R}^m\) is the motor position command (Braun et al. 2012). We assume that the range of the control command \(\mathbf{u}\) is limited as \(\mathbf{u}_{ min } \preceq \mathbf{u}\preceq \mathbf{u}_{ max }\). Note that since the motor dynamics (2) are critically damped, the range constraint on the servo motor positions \(\mathbf{u}_{ min } \preceq \mathbf{q}_m \preceq \mathbf{u}_{ max }\) can also be imposed^{2} (Braun et al. 2012).
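As a rough illustration of the servo model described above, the following sketch integrates a critically damped second-order system driven by a position command. The specific form \(\ddot{q}_m = \alpha^2 (u - q_m) - 2\alpha \dot{q}_m\), the value of \(\alpha\), the step size and the command limits are assumptions chosen for illustration; the text only states that the motor dynamics are critically damped and that \(\varvec{\alpha }\) sets the bandwidth.

```python
def clip(u, u_min, u_max):
    """Box constraint u_min <= u <= u_max on the motor position command."""
    return max(u_min, min(u_max, u))


def simulate_servo(u_cmd, alpha=20.0, dt=1e-3, duration=1.0, q0=0.0):
    """Explicit-Euler rollout of an ASSUMED critically damped servo model.

    Assumed form (not taken from the text):
        q'' = alpha^2 * (u - q) - 2 * alpha * q'
    Returns the list of motor positions, starting at rest from q0.
    """
    q, dq = q0, 0.0
    traj = [q]
    for _ in range(int(round(duration / dt))):
        ddq = alpha ** 2 * (u_cmd - q) - 2.0 * alpha * dq
        # Update position and velocity together from the old state.
        q, dq = q + dt * dq, dq + dt * ddq
        traj.append(q)
    return traj


u = clip(2.5, u_min=-1.57, u_max=1.57)  # hypothetical limits in rad
traj = simulate_servo(u)
print(f"command {u:.2f} rad, final {traj[-1]:.4f} rad, peak {max(traj):.4f} rad")
```

In this sketch the step response approaches the command without overshoot, which is the property that lets a box constraint on \(\mathbf{u}\) carry over to \(\mathbf{q}_m\) when the motor starts inside the box.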

### 2.2 State space representation

### 2.3 Hybrid dynamics with time-based switching and discrete state transition

An example of an instantaneous state transition is an impact map arising from an inelastic collision of the rigid body with an environment (Grizzle et al. 2001; Rosa et al. 2012; Bätz et al. 2010). In the proposed framework, equality constraints at the moment of switching will be approximately imposed by the via-point cost and appropriate time for the event will be found by optimizing switching instances as we discuss below. In this paper, the sequence of switching is assumed to be given. Figure 1 depicts a schematic diagram of a hybrid system we consider in this paper.

### 2.4 Movement optimization of multiple phases

*j*th switching instance and \(h(\mathbf{x}, \mathbf{u})\) is the running cost.

*j*th sequence as the initial condition for the \( (j+1) \)-st sequence with the discrete state transition (8). In this case, each cost function can be (locally) optimized. However, the total cost \(J = \sum _{j=1}^{K+1} J_j\) may be suboptimal.

For the given plant dynamics (7) and state transition (8), the optimization problem we consider is to a) find an optimal feedback control law \(\mathbf{u}= \mathbf{u}(\mathbf{x}, t)\) which minimizes the composite cost (9) and b) simultaneously optimize switching instances \(T_1, \ldots , T_k\) and the final time \(T_f\) as well.
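To make the structure of the problem concrete, the sketch below rolls out a time-switched system with a state jump at each switch and evaluates a composite cost made of a terminal cost, via-point costs at the switching instants and an integrated running cost. It is a minimal sketch under stated assumptions: the function names, the Euler discretization, the toy subsystems and the toy cost terms are all hypothetical and are not the models used in this paper.

```python
def rollout(x0, controls, dynamics, jumps, switch_steps, dt):
    """Euler rollout of time-switched dynamics with discrete state transitions.

    dynamics[i](x, u) -> dx/dt of subsystem i (list of floats)
    jumps[j](x)       -> state just after the j-th switch
    switch_steps[j]   -> step index at which the j-th switch occurs
    Returns all visited states and the pre-jump state at each switch.
    """
    x, phase = list(x0), 0
    states, pre_jump = [list(x)], []
    for k, u in enumerate(controls):
        dx = dynamics[phase](x, u)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        if phase < len(switch_steps) and k + 1 == switch_steps[phase]:
            pre_jump.append(list(x))   # state just before the switch
            x = jumps[phase](x)        # discrete state transition
            phase += 1
        states.append(list(x))
    return states, pre_jump


def composite_cost(states, pre_jump, controls, dt, terminal, via_point, running):
    """Terminal cost + via-point costs at the switches + integrated running cost."""
    J = terminal(states[-1])
    J += sum(via_point[j](xj) for j, xj in enumerate(pre_jump))
    J += sum(running(x, u) * dt for x, u in zip(states[:-1], controls))
    return J


# Toy example (hypothetical): a point mass [position, velocity] that is pushed
# in phase 0, loses half of its speed at the switch, then coasts with drag.
dynamics = [lambda x, u: [x[1], u], lambda x, u: [x[1], u - 0.5 * x[1]]]
jumps = [lambda x: [x[0], 0.5 * x[1]]]
controls = [1.0] * 10 + [0.0] * 10
states, pre = rollout([0.0, 0.0], controls, dynamics, jumps, [10], dt=0.1)
J = composite_cost(states, pre, controls, 0.1,
                   terminal=lambda x: 10.0 * (x[0] - 1.0) ** 2,
                   via_point=[lambda x: (x[1] - 1.0) ** 2],
                   running=lambda x, u: 0.01 * u ** 2)
print(len(states), pre[0], J)
```

Because the via-point cost is evaluated on the pre-jump state and the rollout continues from the post-jump state, the controls of an earlier phase influence the cost of every later phase, which is what a phase-by-phase optimization cannot account for.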

## 3 Spatio-temporal optimization algorithm for timed switching dynamics and discontinuous state transitions

In this section, first we extend iLQR—an approximate local optimal feedback control solver (similar arguments apply for the stochastic equivalent iLQG (Li and Todorov 2007)) with generalization of the switched LQ control with state jumps (Xu and Antsaklis 2003) in order to incorporate timed switching nonlinear dynamics with discrete and discontinuous state transitions. Then, we present a temporal optimization algorithm to optimize the switching instances and the total movement duration. Note that traditional single phase movement optimization can be simply treated as a single set of continuous time dynamics without switching and discontinuous state transition.

### 3.1 Optimal control of switching dynamics and discrete state transition

In brief, the iLQR method solves an optimal control problem of the locally linear quadratic approximation of the nonlinear dynamics and the cost function around a nominal trajectory \(\bar{\mathbf{x}}\) and control sequence \(\bar{\mathbf{u}}\) in discrete time, and iteratively improves the solutions.

*k* is the discrete time step, \(\Delta t_j\) is the sampling time for the time interval \(T_j \le t < T_{j+1}\), and \(k_j\) is the *j*th switching instance in the discretized time step. The sampling time \(\Delta t_j\) will be optimized for the purpose of temporal optimization as described in Sect. 3.2.
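The relation between the discrete switching indices and the continuous switching times can be sketched as follows. The helper name and the indexing convention (cumulative step indices, with one sampling time per phase) are assumptions made for illustration.

```python
def switching_times(switch_steps, dts):
    """Continuous switching times from step indices k_j and sampling times dt_j.

    switch_steps: cumulative step indices [k_1, ..., k_K, N] (N = final step)
    dts:          sampling time used in each of the K+1 phases
    Returns [T_1, ..., T_K, T_f].
    """
    times, t, prev = [], 0.0, 0
    for k_j, dt_j in zip(switch_steps, dts):
        t += (k_j - prev) * dt_j   # duration of this phase
        times.append(t)
        prev = k_j
    return times


# Hypothetical discretization: three phases with different sampling times.
print(switching_times([100, 150, 200], [0.02, 0.01, 0.03]))
```

With the number of steps per phase held fixed, changing \(\Delta t_j\) stretches or compresses phase *j* only, so the switching instants and the total duration can be adjusted without changing the length of the control sequence.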


Once we have a locally optimal control command \(\delta \mathbf{u}\), the nominal control sequence is updated as \(\bar{\mathbf{u}} \leftarrow \bar{\mathbf{u}} + \delta \mathbf{u}\). Then, the new nominal trajectory \(\bar{\mathbf{x}}\) is computed by running the obtained control \(\bar{\mathbf{u}}\) and the above process is iterated until convergence (no further improvement in the cost within certain threshold). In (Tassa et al. 2012), methods for improving convergence and robustness properties of the iLQR/iLQG algorithms are presented in the context of online trajectory optimization.
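The iteration described above (backward pass on a local linear quadratic model, control update, forward rollout, repeat until the cost stops improving) can be sketched on a toy scalar problem. This is a minimal sketch and not the implementation used in this paper: the dynamics \(\dot{x} = -x^3 + u\), the cost weights, the Gauss-Newton approximation (second derivatives of the dynamics are ignored) and the simple backtracking rule are all assumptions.

```python
def ilqr_scalar(x0, target, n_steps=50, dt=0.02, q_T=100.0, r=0.01, iters=20):
    """Gauss-Newton iLQR on a toy scalar system x' = -x^3 + u (Euler steps).

    Cost: 0.5*q_T*(x_N - target)^2 + sum_k 0.5*r*u_k^2*dt.
    Returns nominal controls, feedback gains and the cost history.
    """
    def step_dyn(x, u):
        return x + dt * (-x ** 3 + u)

    def simulate(us_bar, xs_bar=None, ks=None, Ks=None, alpha=0.0):
        xs, us = [x0], []
        for k in range(n_steps):
            u = us_bar[k]
            if ks is not None:  # feedforward update plus feedback on deviation
                u += alpha * ks[k] + Ks[k] * (xs[k] - xs_bar[k])
            us.append(u)
            xs.append(step_dyn(xs[k], u))
        cost = 0.5 * q_T * (xs[-1] - target) ** 2
        cost += sum(0.5 * r * u * u * dt for u in us)
        return xs, us, cost

    xs, us, cost = simulate([0.0] * n_steps)
    Ks = [0.0] * n_steps
    history = [cost]
    for _ in range(iters):
        # Backward pass: propagate the quadratic value function model.
        Vx, Vxx = q_T * (xs[-1] - target), q_T
        ks, Ks_new = [0.0] * n_steps, [0.0] * n_steps
        for k in reversed(range(n_steps)):
            fx, fu = 1.0 - 3.0 * dt * xs[k] ** 2, dt
            Qx, Qu = fx * Vx, r * dt * us[k] + fu * Vx
            Qxx, Quu, Qux = fx * Vxx * fx, r * dt + fu * Vxx * fu, fu * Vxx * fx
            ks[k], Ks_new[k] = -Qu / Quu, -Qux / Quu
            Vx, Vxx = Qx - Qux * Qu / Quu, Qxx - Qux * Qux / Quu
        # Forward pass: accept the first step size that lowers the cost.
        improved = False
        for alpha in (1.0, 0.5, 0.25, 0.125):
            xs_new, us_new, cost_new = simulate(us, xs, ks, Ks_new, alpha)
            if cost_new < cost:
                xs, us, cost, Ks = xs_new, us_new, cost_new, Ks_new
                improved = True
                break
        history.append(cost)
        if not improved:
            break
    return us, Ks, history


us, Ks, history = ilqr_scalar(x0=0.0, target=1.0)
print(f"initial cost {history[0]:.3f}, final cost {history[-1]:.5f}")
```

Besides the open-loop sequence, the backward pass yields the time-varying gains `Ks`, which is the local feedback law that is exploited later when the controller is applied under perturbations.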

### 3.2 Temporal optimization

*t* to a canonical time \(t'\) is introduced as

This iterative temporal optimization algorithm with alternating updates of control commands and temporal parameters has been proposed in the general context of an inference based stochastic optimal control framework in (Rawlik et al. 2010; Rawlik 2013). As discussed in (Rawlik 2013), because it is generally intractable to obtain the combined optimal policy at once, the idea of the two step procedure has been introduced. In each iteration, improvement of control commands and temporal parameters will be performed to jointly reduce the cost function. In addition, in the control theoretic literature, a similar iterative two stage strategy has been proposed for a class of switching systems to optimize control commands and switching instances (Xu and Antsaklis 2004).

As a result of these iterative optimization procedures, at best, convergence to a locally optimal solution could be expected. Effectiveness of this approach has been illustrated in variable distance and via-point reaching tasks (Rawlik et al. 2010; Rawlik 2013) and in simple numerical examples of a class of switching systems (Xu and Antsaklis 2004).
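The two step structure can be illustrated on a deliberately simple toy problem: a single integrator \(\dot{x} = u\) that should travel a distance \(d\) with a constant control over \(n\) steps of length \(\Delta t\). This is only a sketch of the alternation, not the algorithm of the cited works: the cost, its weights and the closed-form updates are assumptions, and the temporal update here is a Newton step (exact because this toy cost is quadratic in \(\Delta t\)).

```python
def cost(u, dt, n=50, d=1.0, q=10.0, r=1.0, w_T=0.5):
    """Toy cost: terminal error + control effort + time cost, with T = n*dt."""
    T = n * dt
    return 0.5 * q * (T * u - d) ** 2 + 0.5 * r * u * u * T + w_T * T


def alternate(dt=0.05, n=50, d=1.0, q=10.0, r=1.0, w_T=0.5, iters=200):
    """Alternate between a control update and a sampling-time update."""
    u = 0.0
    history = [cost(u, dt, n, d, q, r, w_T)]
    for _ in range(iters):
        # Step 1: best constant control for the current sampling time.
        T = n * dt
        u = q * d / (q * T + r)
        # Step 2: update the sampling time with the control held fixed.
        grad = n * (q * u * (T * u - d) + 0.5 * r * u * u + w_T)
        hess = n * n * q * u * u
        dt = dt - grad / hess
        history.append(cost(u, dt, n, d, q, r, w_T))
    return u, n * dt, history


u_opt, T_opt, history = alternate()
print(f"u = {u_opt:.4f}, T = {T_opt:.4f}, cost {history[0]:.3f} -> {history[-1]:.4f}")
```

For these weights the jointly optimal duration can be derived by hand as \(T^* = d\sqrt{r/(2 w_T)} - r/q = 0.9\), which gives a reference for checking where the alternation settles. Each step reduces the cost only along one block of variables, which is why convergence to a local optimum is the most that can be expected in general.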

## 4 Exploitation of passive dynamics with spatio-temporal optimization of stiffness

We explore the benefit of simultaneous variable stiffness and temporal optimization for tasks exploiting intrinsic dynamics of the system. We consider brachiation (Saito et al. 1994; Nakanishi et al. 2000; Gomes and Ruina 2005; Rosa et al. 2012) as an example of a highly dynamic maneuver requiring utilization of passive dynamics for successful task execution.

### 4.1 Brachiating robot dynamics with VSA

We use MACCEPA (Van Ham et al. 2007) as our VSA implementation of choice. MACCEPA is one of the designs of mechanically adjustable compliant actuators with a passive elastic element (cf. Fig. 2). This actuator design has the desirable characteristics that the joint can be very passively compliant. This allows free swinging with a large range of movement by relaxing the spring. Thus, it is highly suitable for the brachiation task we consider. MACCEPA is equipped with two position controlled servo motors, \(\mathbf{q}_m=[\;q_{m1},\;q_{m2}\;]^T\), which control the equilibrium position and the spring pre-tension, respectively.

*q* is the joint angle^{5}, *F* is the spring tension,

Table 1 Model parameters of the two-link brachiating robot

| Robot parameters | Symbol (unit) | i = 1 | i = 2 |
|---|---|---|---|
| Mass | \(m_i\) (kg) | 1.390 | 0.527 |
| Moment of inertia | \(I_i\) (kgm\(^2\)) | 0.0297 | 0.0104 |
| Link length | \(l_i\) (m) | 0.46 | 0.46 |
| COM location | \(l_{ci}\) (m) | 0.362 | 0.233 |
| Viscous friction | \(d_i\) (Nm/s) | 0.03 | 0.035 |

### 4.2 Optimization of single phase movement in brachiation task

A natural and desirable strategy for a swing movement in brachiation would be to make good use of gravity by making the joints passive and compliant. For a system with VSAs, our idea in exploiting passive dynamics is to frame the control problem in finding an appropriate (preferably small) stiffness profile to modulate the system dynamics only when necessary and compute the virtual equilibrium trajectory (Shadmehr 1990) to fulfill the specified task requirement.

*F* is the spring tension in the VSA. This objective function is designed in order to reach the target located at \(\mathbf{r}^*\) at the specified time *T* while minimizing the spring tension *F* in the VSA. Note that the main component in the running cost is to minimize the spring tension *F* by the second term while the first term \(\mathbf{u}^T \mathbf{R}_1 \mathbf{u}\) is added for regularization with a small choice of the weights in \(\mathbf{R}_1\). In practice, this is necessary since *F* is a function of the state and iLQR requires a control cost in its formulation to compute the optimal control law.

*F* has a similar role to the stiffness parameter *k* as in the simplified actuator model \(\tau = -k(q-q_m)\). Another interpretation can be considered in such a way that if we linearize (34) around the equilibrium position assuming that \( q_{m1}-q \ll 1\), the relationship between the joint stiffness *k* in (36) and the spring tension *F* can be approximated as

*F* corresponds to minimizing the stiffness *k* in an approximated way. Note that it is possible to directly use *k* in the cost function. However, in practice, first and second derivatives of *k* are needed to implement the iLQG algorithm which become significantly more complex than those of *F*.

### 4.3 Benefit of temporal optimization

This section numerically explores the benefit of temporal optimization in exploiting natural dynamics of the system. One of the issues in a conventional optimal control formulation is that the time horizon needs to be given in advance for a given task. While on fully actuated systems, control can be used to enforce a pre-specified timing, it is not possible to choose an arbitrary time horizon on underactuated systems. In a brachiation task, determination of an appropriate movement horizon, i.e., matching the movement duration corresponding to the property of the natural dynamics of the pendulum-like swing motion, is essential for successful task execution with reduced control effort.

Consider the swing locomotion task on a ladder with the intervals starting from the bar at \(d_{ start }=0.42\) m to the target located at \(d_{ target }=0.46\) m (cf. Fig. 2). We optimize both the control command \(\mathbf{u}\) and the movement duration *T*. We use \(\mathbf{Q}_T = \mathrm {diag}(10000, 10000, 10, 10)\), \(\mathbf{R}_1 = \mathrm {diag}(0.0001, 0.0001) \) and \(R_2 = 0.01\) for the cost function in (38). The optimized movement duration was \(T=0.806\) s.
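To show how the weights quoted above enter the objective, the sketch below evaluates a cost of the kind described in Sect. 4.2: a quadratic terminal error weighted by \(\mathbf{Q}_T\) plus a running cost \(\mathbf{u}^T \mathbf{R}_1 \mathbf{u} + R_2 F^2\). The exact form of (38) is not reproduced here, so this form, the choice of a four-dimensional terminal error (end-effector position and velocity), and the numbers in the example are assumptions.

```python
Q_T = [10000.0, 10000.0, 10.0, 10.0]   # diagonal of Q_T (from the text)
R_1 = [0.0001, 0.0001]                  # diagonal of R_1 (from the text)
R_2 = 0.01                              # spring-tension weight (from the text)


def single_phase_cost(y_T, y_target, us, tensions, dt):
    """ASSUMED cost: terminal error + integral of (u^T R_1 u + R_2 F^2)."""
    terminal = sum(w * (a - b) ** 2 for w, a, b in zip(Q_T, y_T, y_target))
    running = sum(
        (sum(w * ui * ui for w, ui in zip(R_1, u)) + R_2 * F * F) * dt
        for u, F in zip(us, tensions)
    )
    return terminal + running


# Hypothetical trajectory: 806 steps of 1 ms (T = 0.806 s), zero commands,
# a constant spring tension of 10 N and a 1 cm horizontal position error.
n, dt = 806, 1e-3
J = single_phase_cost(
    y_T=[0.47, 0.0, 0.0, 0.0], y_target=[0.46, 0.0, 0.0, 0.0],
    us=[[0.0, 0.0]] * n, tensions=[10.0] * n, dt=dt,
)
print(f"J = {J:.3f}")
```

With these weights a 1 cm position error costs about as much as holding 10 N of spring tension for the whole movement, while the command term contributes almost nothing, consistent with its role as a regularizer.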

### 4.4 Benefit of stiffness variation

In this section, we investigate the benefit of time-varying stiffness modulation. One of the characteristics of VSAs is their ability to simultaneously modulate joint torque and stiffness. Modulating stiffness effectively alters the properties of the system dynamics such as natural frequency. Thus, the capability of modulating stiffness during the motion can be beneficial for improving the task performance.

We demonstrate the benefit of time-varying modulation of stiffness by comparing optimal *variable* stiffness control and optimal *fixed* stiffness control. In optimal *variable* stiffness control, both the control commands \(u_1(t)\) and \(u_2(t)\) in \(\mathbf{u}=[\;u_1(t),\;u_2(t)\;]^T\) are optimized in a time-varying manner during the movement to independently control the joint torque and the joint stiffness. In optimal *fixed* stiffness control, the command to the spring pre-tensioning servomotor is fixed to the optimal constant value throughout the movement, i.e., \(\mathbf{u}= [\;u_1(t),\;u_2\;]^T\) where \(u_2 = \mathrm {const.}\) Note that constant command to the pre-tensioning servomotor \(u_2\) does not necessarily mean constant joint stiffness with MACCEPA (Braun et al. 2012). In the case of optimal fixed stiffness control, it was possible to achieve a comparable swing movement for the same intervals of \(d_{ start }=0.42\) m and \(d_{ target }=0.46\) m as in Sect. 4.3 above. However, optimal fixed stiffness control incurs a higher cost \(J=9.528\) than that of the corresponding optimal variable stiffness case with \(J=2.979\).

In addition, we compare the performance of variable and fixed optimal stiffness control in terms of the range of distances that can be reached with the robot. Starting from the bar at the nominal distance \(d_{ start }=0.42\) m, we vary the target positions \(d_{ target }\) in steps of 0.01 m and optimize control commands and movement duration. When the end-effector position at \(t=T_f\) is within a tolerance of 0.01 m from the location of the target, we assume that the trial is successful. With optimal variable stiffness control, the robot was able to reach the target in the range of \(d_{ target } \in [0.39, 0.59]\) m (the range of 0.20 m) while with optimal fixed stiffness control, it was \(d_{ target } \in [0.42, 0.52]\) m (the range of 0.10 m). These numerical explorations illustrate the benefit of optimal *variable* stiffness control in terms of the cost and range of distances achieved in swing locomotion in comparison to optimal *fixed* stiffness control.

## 5 Spatio-temporal optimization of multiple swings in brachiation

We evaluate the effectiveness of the proposed approach in robot brachiation that incorporates switching dynamics and multiple phases of the movement. We present numerical simulations and experimental implementation on a physical two-link brachiating robot with VSA.

### 5.1 Brachiating robot model in hybrid dynamics formulation

### 5.2 Simulation results

*F* is the spring tension in the VSA. Note that this cost function includes the time cost \(w_T T_1\) for the swing up maneuver. We use \(\mathbf{Q}_T = \mathbf{Q}_{T_j} = \mathrm {diag}(10000, 10000, 10, 10)\), \(\mathbf{R}_1 = \mathrm {diag}(0.0001, 0.0001) \), \(R_2 = 0.01\) and \(w_T=1\). In addition, we impose constraints on the range of the angle of the second joint during the course of the swing up maneuver as \(q_{2_{ min }} \le q_2 \le q_{2_{ max }}\), where \([q_{2_{ min }},q_{2_{ max }}]=[-1.745, 1.745]\) rad, by adding a penalty term to the cost (44). This is introduced considering the physical joint limit of the hardware platform used in this paper.

Figure 5a shows the sequence of the multi-phase movement of the robot optimized by the proposed algorithm including temporal optimization. The optimized switching instances and total movement duration are \(T_1 = 5.259\), \(T_2 = 6.033\) and \(T_f = 6.835\) s and the total cost is \(J = 37.815\). Figure 5b shows the optimized joint trajectories and servo motor positions. Note that at the switching instances denoted by vertical lines, discrete state transitions can be observed in these trajectories due to the definition of the coordinate transformation.

### 5.3 Hardware platform of a brachiating robot with VSA

The configuration of the brachiating robot with VSA is depicted in Fig. 6. The elbow joint is actuated with a VSA (MACCEPA (Van Ham et al. 2007)) having two servo motors (Hitec HS-7940TH). Each link is equipped with a gripper driven by a single servo motor (Hitec HSR-5990TG) to open and close it through a gear mechanism. The angle of the first link is obtained through an IMU (InvenSense MPU-6050) attached to the link. The angle of the second link is measured by a rotary potentiometer (Alps RDC503013A) at the elbow joint. The servo motor positions are measured through direct access to their internal potentiometers. The operating frequency of control and measurement is 1 kHz. The length of each link is 0.46 m and the total mass is 1.92 kg. The link parameters are obtained from the CAD model while friction coefficients and the servo motor bandwidth parameters are estimated by fitting the actual responses of the robot. Our numerical exploration showed that with an inadequate mass distribution, it was difficult to find an optimal solution in achieving desired swing locomotion behavior. For this reason, the mass distribution of the robot was chosen to resemble the desirable natural dynamics required for the task (see Table 1 for the parameters).

### 5.4 Experimental results

Figure 7 shows the experimental result of swing locomotion on the ladder with the intervals of \(d_{ start }=0.42\) m and \(d_{ target }=0.46\) m. The optimized control commands with the optimal movement duration obtained in the corresponding simulation in Sect. 4.3 are used. In Fig. 7a, the movement of the robot is depicted while in Fig. 7b the joint trajectories and servo motor positions are shown. This result corresponds to the simulation in Fig. 3 with the optimal movement duration. In the experiments, we only use the open-loop optimal control command to the servo motors without state feedback as in (Braun et al. 2012).

The experiments above are presented in the accompanying video. These results demonstrate the effectiveness and feasibility of the proposed framework in achieving highly dynamic tasks in compliantly actuated robots with variable stiffness capability under real-world conditions.

## 6 Multi-phase optimization in hopping with VSA

In this section, we demonstrate the feasibility of the proposed approach on the more challenging task of hopping, which includes switching between different modes of dynamics (flight and stance) and a more complex discontinuous state transition arising from impact at touch-down. An additional difficulty in this task is to find the flight and stance times for successful task execution, which are highly restricted by the underactuated nature of the intrinsic dynamics and the desired task specifications.

We consider the hopping robot model in (Hyon and Emura 2004) with an augmentation of variable compliance elements in the hip and the leg actuators. The objective of optimization is to find appropriate leg and hip stiffness to exploit the passive dynamics and also the required flight and stance time during one locomotion cycle in a periodic movement based on a time-based switching approximation. The obtained controller is then applied to achieve multiple hopping cycles of locomotion on event based switching dynamics. Robustness of the obtained optimal *feedback* controller will be evaluated by applying external disturbances during the multiple hopping cycles to demonstrate the feasibility of the optimized controller.

### 6.1 Dynamics model of a hopping robot

*M* is the mass of the body, \(J_l\) is the leg inertia, \(J_b\) is the body inertia, *g* is the gravitational constant, \(r_0\) is the nominal length of the leg spring. \(\tau _{ hip }\) and \(\tau _{ leg }\) are the torque applied to the hip joint and the force applied to the leg by the VSAs as given in (48) and (49) below, respectively. We use the parameters \(M=11.0\) kg, \(J_b=2.5\) kgm\(^2\), \(J_l=0.25\) kgm\(^2\) and \(r_0=0.7\) m adopted from (Ahmadi and Buehler 1997).

For the purpose of optimization, the dynamics will be formulated in a state space representation of the form of \(\dot{\mathbf{x}}=\mathbf{f}_i(\mathbf{x}, \mathbf{u})\) as in (5) with the full state vector \(\mathbf{x}=[\;\mathbf{q}^T,\;\dot{\mathbf{q}}^T\;]^T\) and \(\mathbf{q}=[x_{ com },y_{ com },\theta ,\phi ,r]^T\). In this hopping robot, we consider a simplified parallel elastic VSA model with direct force/torque and stiffness control as in (Hyon and Emura 2004), which does not include the motor dynamics^{6} (2).

### 6.2 Design of composite cost function

In this paper, we consider a task of achieving periodic movement of continuous hopping which is a repetition of one hopping cycle while exploiting the passive dynamics and the benefits of stiffness modulation. For this purpose, first, we design a composite cost function for one hopping cycle including both the flight and stance phases and the desirable touch-down condition. Then, the obtained controller is applied to achieve multiple cycles.

### 6.3 Simulation results

We choose the desired initial condition at lift off as \(\dot{x}_0=2.0\) m/s and \(\theta _0=-6.0\) deg (\(-0.105\) rad). The rest is obtained based on an approximated condition of passive running (Hyon and Emura 2004) for the nominal model of Ahmadi and Buehler (1997). With this initial condition, the passive dynamics (no control) of the robot can achieve several steps of running as reported by Ahmadi and Buehler (1997) and Hyon and Emura (2004). However, eventually, it will fail since passive running is intrinsically unstable.

Using the proposed method, we simultaneously obtained the optimal feedback control for the control commands \((u_1,u_2)\) and stiffness \((u_3, u_4)\), and found the flight time \(T_1\) and the period for one complete cycle \(T_f\) for one hopping cycle. The optimized flight time and one hopping cycle were \(T_1 = 0.410\) s and \(T_f = 0.487\) s. Since the obtained controller is based on an assumption of time-based switching, there could be some mismatch in the exact timing in the switching condition when applied to realistic event based switching dynamics (flight to stance at touch-down \(y_{ foot }= y_{ ground }\), stance to flight when \(r=r_0\)) to achieve multiple cycles of locomotion. One of the benefits of our approach is that it provides a locally optimal *feedback* control, so that deviations from the optimal trajectory can be corrected, as illustrated in the following examples. These simulation results are presented in the accompanying video.
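The event based switching conditions quoted above can be sketched as a small phase-update rule. The function name and the direction checks on the foot and leg velocities are assumptions added for illustration: without them, a rule based only on \(y_{ foot } \le y_{ ground }\) and \(r \ge r_0\) would switch back to flight immediately after touch-down, when the leg is still at its nominal length.

```python
FLIGHT, STANCE = "flight", "stance"


def next_phase(phase, y_foot, dy_foot, r, dr, y_ground=0.0, r0=0.7):
    """Event-based phase update for a hopping model (illustrative sketch).

    flight -> stance: foot reaches the ground while moving downward
    stance -> flight: leg returns to its nominal length while extending
    The velocity checks are assumptions added to avoid chattering.
    """
    if phase == FLIGHT and y_foot <= y_ground and dy_foot < 0.0:
        return STANCE
    if phase == STANCE and r >= r0 and dr > 0.0:
        return FLIGHT
    return phase


# Hypothetical samples around one touch-down and lift-off.
phase = FLIGHT
phase = next_phase(phase, y_foot=0.02, dy_foot=-1.0, r=0.7, dr=0.0)   # still airborne
phase = next_phase(phase, y_foot=0.0, dy_foot=-1.0, r=0.7, dr=0.0)    # touch-down
phase = next_phase(phase, y_foot=0.0, dy_foot=0.0, r=0.65, dr=-0.5)   # leg compressing
phase = next_phase(phase, y_foot=0.0, dy_foot=0.0, r=0.7, dr=0.8)     # lift-off
print(phase)
```

In such a setting the switch happens whenever the event fires rather than at the optimized \(T_1\) or \(T_f\), so the state at the switch generally differs from the nominal one and has to be corrected by the feedback gains.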

*Comparison to individual phase optimization* As a comparison, we optimized the control command, stiffness and the movement duration in a sequential manner individually for the flight phase subsequently followed by the stance phase for one cycle of the movement. The optimized movement duration for the flight phase was \(T_{1, ind } = 0.410\) s and for the stance phase was \(T_{ stance,ind }=0.080\) s, i.e., the total duration was \(T_{ f,ind }=0.490\) s. The total cost for this individual optimization was \(J_{ ind }=1.686\) which is comparable to the complete optimization case \(J_{ comp }=1.624\) mentioned above. The optimized trajectories, control commands and stiffness profiles are similar between these two cases. However, interestingly, there is a notable difference in the robustness of the controller when these two were applied to the event based switching dynamics where the role of the *feedback* control becomes prominent.

The controller with complete cycle optimization was able to achieve continuous stable running over multiple cycles. However, with the controller obtained by individual optimization, the robot failed to continue to run after 25 steps of hopping. Although this is an empirical observation, this difference presumably came from the difference in the optimal *feedback* gains. In the complete cycle optimization, the optimal feedback gains take the future goal until the end of the hopping cycle into account including both the flight and stance phases with the via-point and terminal costs. However, in the individual optimization, corrections are made only considering the immediate goal specified by the terminal cost in each phase. This result highlights the benefits of optimizing the whole cycle of the movement in comparison to individually optimizing the movement in a sequential manner. This comparison is demonstrated in the accompanying video.

*Robustness against perturbations* In this simulation, we evaluate the robustness of the obtained optimal feedback controller by applying external perturbations while the robot is running. At \(t=1.0\) s, the robot is pushed forward with \(F_x=150\) N and at \(t=2.0\) s, a backward perturbation of \(F_x=-250\) N is applied, each for a duration of 0.05 s. Figure 10a depicts the movement of the robot from \(t=0.7\) to \(t=3.9\) s. Figure 10b shows the forward velocity \(\dot{x}\) (top), body height \(y_{ com }\) (middle), and leg angle \(\theta \) and hip angle \(\phi \) (bottom) from \(t=0\) to \(t=6\) s. Figure 10c shows the control commands \(u_1\) and \(u_2\) (top), hip stiffness \(u_3\) (center) and leg stiffness \(u_4\) (bottom). The simulation results illustrate that after the perturbations, the robot was able to stabilize the periodic running behavior without falling over, demonstrating the robustness of the optimal feedback controller and the feasibility of the proposed approach in this problem setting. This result is illustrated in the accompanying video.
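To give a sense of scale for these perturbations, the sketch below encodes the force profile and computes the impulse of each push. It assumes that both pushes last 0.05 s, and the velocity change is a back-of-the-envelope figure that treats the robot as a free point mass with the body mass \(M=11.0\) kg; the actual effect on the hopping model depends on the phase and the leg dynamics.

```python
M_BODY = 11.0  # body mass in kg (from Sect. 6.1)

# (start time in s, force in N, duration in s); durations assumed equal.
PUSHES = [(1.0, 150.0, 0.05), (2.0, -250.0, 0.05)]


def perturbation_force(t):
    """Horizontal external force F_x(t) as a piecewise-constant profile."""
    for start, force, duration in PUSHES:
        if start <= t < start + duration:
            return force
    return 0.0


def impulse(force, duration):
    """Impulse of a constant force in N*s."""
    return force * duration


for start, force, duration in PUSHES:
    p = impulse(force, duration)
    # Velocity change if the whole impulse acted on a free mass M_BODY.
    print(f"t = {start} s: impulse {p:+.2f} N*s, delta v ~ {p / M_BODY:+.2f} m/s")
```

Under that point-mass assumption the two pushes correspond to velocity changes of roughly a third and a half of the nominal forward speed of 2.0 m/s, so they are substantial disturbances for the feedback controller.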

## 7 Discussion

In this section, we discuss the benefits, practical considerations and possible limitations of the proposed approach.

In Sect. 4.3, the benefit of temporal optimization in exploiting the intrinsic dynamics of the system was presented with an example of swing locomotion in a VSA-actuated brachiating robot. A much more significant effect of temporal optimization was illustrated in our previous work (Nakanishi et al. 2011) with a torque-controlled brachiating robot, in terms of the control torque required to achieve the swing motion. With the optimized movement duration, only a very small amount of torque was needed to achieve the task. However, a slight change in the movement time resulted in a significant increase in the required control command. Furthermore, in the case of periodic movement optimization (Nakanishi et al. 2011), finding an appropriate frequency of the motion matching the natural frequency of the system could reduce the required control effort. These results highlight the benefit of temporal optimization in exploiting the natural dynamics.
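The sensitivity of control effort to movement timing can be seen in a much simpler toy system than the robots considered here. The sketch below assumes an undamped mass-spring system \(m\ddot{x}+kx=u\) tracking \(x(t)=A\sin (\omega t)\), for which the required command is \(u(t)=A(k-m\omega ^2)\sin (\omega t)\). The parameter values and the function name `mean_squared_effort` are hypothetical and are not taken from the paper.

```python
import math


def mean_squared_effort(omega, m=1.0, k=4.0, amplitude=1.0):
    """Mean of u(t)^2 over one period for x(t) = A sin(omega t).

    Toy model (assumption): m x'' + k x = u, so
    u(t) = A (k - m omega^2) sin(omega t), and the mean of sin^2 is 1/2.
    """
    return 0.5 * (amplitude * (k - m * omega ** 2)) ** 2


if __name__ == "__main__":
    omega_n = math.sqrt(4.0 / 1.0)  # natural frequency sqrt(k/m) [rad/s]
    for scale in (0.8, 0.9, 1.0, 1.1, 1.2):
        omega = scale * omega_n
        print(f"omega = {omega:4.2f} rad/s  effort = {mean_squared_effort(omega):.4f}")
```

In this toy model the effort vanishes when the motion frequency equals the natural frequency \(\sqrt{k/m}\) and grows quickly on either side, which mirrors the qualitative observation above.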

In Sect. 5.2, the benefit of multi-phase movement optimization was presented in a brachiation example in terms of the improvement in cost and performance in comparison to individual phase optimization. The main difference from individual phase optimization is that multi-phase optimization takes the future goals into account. The effect of multi-phase optimization in the brachiation example may be less intuitive since the joint velocities are reset every time the robot grasps the bar. In this case, the final positions of the VSA servomotors in each phase can be appropriately determined considering the next phase of the movement in order to adjust the spring tension. The example of a via-point reaching task in (Rawlik et al. 2010; Rawlik 2013) demonstrates the benefit of multi-phase optimization more clearly: the velocity and the resultant curvature of the trajectory when passing the via-point can be determined considering the next target position.

One of the practical considerations of the time-based switching approach is the feasibility and accuracy of the approximation of the switching condition. In general, the accuracy of this approximation largely depends on the nature of the task and the design of the cost function, since the switching condition is effectively imposed by the via-point and terminal costs at their corresponding times. In the case of brachiation, considering the design of the gripper in the physical robot, a small error at the endpoint was empirically tolerated as long as the robot was able to grasp the bar. In the hopping example, as the ground contact condition is more critical, the weights of the cost were empirically adjusted in order to reduce the mismatch. In addition, temporal optimization was helpful in reducing the error by finding an appropriate movement duration, which cannot be arbitrarily predetermined. Our empirical results suggest that a small mismatch in the switching condition can be alleviated by the use of proper state feedback. It would be of interest in future work to evaluate the robustness of the obtained optimal controller against such a mismatch in a more systematic manner.
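The role of the movement duration in reducing the switching mismatch can be illustrated with a toy flight phase. This sketch assumes a ballistic point foot with hypothetical initial height and velocity, a quadratic penalty on the touchdown condition (foot height equal to zero at the switching time), and a plain grid search standing in for the temporal optimization used in the paper. All names and values are illustrative assumptions.

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]


def foot_height(t, y0=0.1, v0=1.0):
    """Foot height [m] in ballistic flight (toy model, hypothetical values)."""
    return y0 + v0 * t - 0.5 * G * t ** 2


def switching_penalty(T, weight=1000.0):
    """Quadratic penalty encoding the touchdown condition foot_height(T) = 0."""
    return weight * foot_height(T) ** 2


def best_duration(candidates):
    """Grid-search stand-in for temporal optimization of the phase duration."""
    return min(candidates, key=switching_penalty)


T_fixed = 0.2                                 # arbitrarily predetermined duration [s]
T_grid = [i * 1e-3 for i in range(1, 1000)]   # candidate durations [s]
T_opt = best_duration(T_grid)
T_exact = (1.0 + math.sqrt(1.0 + 2.0 * G * 0.1)) / G  # analytic touchdown time

if __name__ == "__main__":
    print(f"fixed T = {T_fixed:.3f} s  mismatch = {foot_height(T_fixed):+.4f} m")
    print(f"tuned T = {T_opt:.3f} s  mismatch = {foot_height(T_opt):+.4f} m")
```

With the predetermined duration, the foot is still about 0.10 m above the ground at the switching time, whereas the tuned duration brings the residual mismatch below a millimetre in this toy setting.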

In terms of feasibility, application of the proposed spatio-temporal optimization approach could be limited to cases where the switching condition can be represented by a cost function (penalty), the order of switching is known, and a reasonable initial estimate of the desirable movement duration is available. If the sequence of switching is not given a priori, direct trajectory optimization approaches with state-based switching could be more suitable, e.g., (Posa et al. 2014). If we do not consider multi-phase movement optimization as a whole, i.e., when only individual movement optimization is considered, it would be possible to use the event-based first-exit strategy as in (Kulchenko and Todorov 2011). However, this would result in a sub-optimal solution overall, as discussed in (Rawlik 2013). In these methods, temporal optimization becomes more difficult since the movement time (switching instance) depends on the resultant system’s trajectory, as discussed in (Xu and Antsaklis 2004).

## 8 Conclusion

In this paper, we have presented a systematic methodology for movement optimization with multiple phases and switching dynamics in robotic systems with VSA, with a focus on exploiting the intrinsic dynamics of the system. Tasks including switching dynamics and interaction with an environment are approximately modelled as a hybrid dynamical system with time-based switching. We have demonstrated the benefit of simultaneous temporal and variable stiffness optimization, leading to a reduction in control effort and improved performance. With an appropriate choice of the composite cost function to encode the task, we have demonstrated the effectiveness of the proposed approach in various example tasks in numerical simulations and in a hardware implementation on a brachiating robot with VSA. Future work will aim at investigating optimization in biped locomotion with VSA including variable damping (Enoch and Vijayakumar 2016; Radulescu et al. 2012), as well as an extension to learning approaches to address modelling uncertainties of the system dynamics (Mitrovic 2010).

## Footnotes

1. \(\varvec{\alpha }=\mathrm {diag}(a_1, \ldots , a_m)\) and \(\varvec{\alpha }^2=\mathrm {diag}(a_1^2, \ldots , a_m^2)\) for notational convenience.

2. \(\preceq \) denotes component-wise inequality.

3. For notational convenience, note that in (16), \(\phi _{\mathbf{x}}\) and \(\phi _{\mathbf{x}\mathbf{x}}\) denote \(\phi _{\mathbf{x}}=\frac{\partial \phi }{\partial \mathbf{x}}\) and \(\phi _{\mathbf{x}\mathbf{x}}=\frac{\partial ^{2} \phi }{\partial \mathbf{x}^2}\), respectively. Similar definitions apply to other partial derivatives.

4. At the final time \(k=N\), \(\mathbf{S}_N=\phi _{\mathbf{x}\mathbf{x}}\) and \(\mathbf{s}_N=\phi _{\mathbf{x}}\).

5. In the brachiating robot model, \(q=q_2\).

6.

## Notes

### Acknowledgments

This work was supported by the European Union Seventh Framework Programme as part of the STIFF and TOMSY projects. We would like to thank Andrius Sutas for the development of the electronics and control interface of the hardware and Alexander Enoch for the design of the earlier version of the robot arm. Also, we would like to thank Matthew Howard, Takeshi Mori, and Konrad Rawlik for discussions on this study.

## Supplementary material

Supplementary material 1 (mpeg 87556 KB)

## References

- Ahmadi, M., & Buehler, M. (1997). Stable control of a simulated one-legged running robot with hip and leg compliance. *IEEE Transactions on Robotics and Automation*, *13*(1), 96–104.
- Bätz, G., Mettin, U., Schmidts, A., Scheint, M., Wollherr, D., & Shiriaev, A. S. (2010). Ball dribbling with an underactuated continuous-time control phase: Theory & experiments. In *IEEE/RSJ international conference on intelligent robots and systems* (pp. 2890–2895).
- Branicky, M. S., Borkar, V. S., & Mitter, S. K. (1998). A unified framework for hybrid control: Model and optimal control theory. *IEEE Transactions on Automatic Control*, *43*(1), 31–45.
- Braun, D., Howard, M., & Vijayakumar, S. (2012). Optimal variable stiffness control: Formulation and application to explosive movement tasks. *Autonomous Robots*, *33*(3), 237–253.
- Braun, D. J., Petit, F., Huber, F., Haddadin, S., van der Smagt, P., Albu-Schäffer, A., et al. (2013). Robots driven by compliant actuators: Optimal control under actuation constraints. *IEEE Transactions on Robotics*, *29*(5), 1085–1101.
- Bryson, A. E., & Ho, Y. C. (1975). *Applied optimal control*. New York: Taylor & Francis.
- Buss, M., Glocker, M., Hardt, M., von Stryk, O., Bulirsch, R., & Schmidt, G. (2002). Nonlinear hybrid dynamical systems: Modeling, optimal control, and applications. In S. Engell, G. Frehse, & E. Schnieder (Eds.), *Modelling, analysis, and design of hybrid systems. Lecture notes in control and information science* (pp. 311–335). Berlin: Springer.
- Caldwell, T. M., & Murphey, T. D. (2012). Single integration optimization of linear time-varying switched systems. *IEEE Transactions on Automatic Control*, *57*(6), 1592–1597.
- Catalano, M. G., Schiavi, R., & Bicchi, A. (2010). Mechanism design for variable stiffness actuation based on enumeration and analysis of performance. In *IEEE international conference on robotics and automation* (pp. 3285–3291).
- Egerstedt, M., Wardi, Y., & Delmotte, F. (2003). Optimal control of switching times in switched dynamical systems. In *IEEE conference on decision and control* (pp. 2138–2143).
- Eiberger, O., Haddadin, S., Weis, M., Albu-Schäffer, A., & Hirzinger, G. (2010). On joint design with intrinsic variable compliance: Derivation of the DLR QA-Joint. In *IEEE international conference on robotics and automation* (pp. 1687–1694).
- Enoch, A., & Vijayakumar, S. (2016). Rapid manufacture of novel variable impedance robots. *Journal of Mechanisms and Robotics*, *8*(1), 553–567.
- Garabini, M., Passaglia, A., Belo, F., Salaris, P., & Bicchi, A. (2011). Optimality principles in variable stiffness control: The VSA hammer. In *IEEE/RSJ international conference on intelligent robots and systems* (pp. 3770–3775).
- Gomes, M. W., & Ruina, A. L. (2005). A five-link 2D brachiating ape model with life-like zero-energy-cost motions. *Journal of Theoretical Biology*, *237*(3), 265–278.
- Grizzle, J. W., Abba, G., & Plestan, F. (2001). Asymptotically stable walking for biped robots: Analysis via systems with impulse effects. *IEEE Transactions on Automatic Control*, *46*(1), 51–64.
- Haddadin, S., Weis, M., Wolf, S., & Albu-Schäffer, A. (2011). Optimal control for maximizing link velocity of robotic variable stiffness joints. In *18th IFAC world congress* (pp. 6863–6871).
- Hondo, T., & Mizuuchi, I. (2011). Analysis of the 1-joint spring-motor coupling system and optimization criteria focusing on the velocity increasing effect. In *IEEE international conference on robotics and automation* (pp. 1412–1418).
- Hurst, J. W., Chestnutt, J. E., & Rizzi, A. A. (2010). The actuator with mechanically adjustable series compliance. *IEEE Transactions on Robotics*, *26*(4), 597–606.
- Hyon, S. H., & Emura, T. (2004). Energy-preserving control of a passive one-legged running robot. *Advanced Robotics*, *18*(4), 357–381.
- Jafari, A., Tsagarakis, N. G., Vanderborght, B., & Caldwell, D. G. (2010). A novel actuator with adjustable stiffness (AwAS). In *IEEE/RSJ international conference on intelligent robots and systems* (pp. 4201–4206).
- Karssen, J. G. D., & Wisse, M. (2011). Running with improved disturbance rejection by using non-linear leg springs. *International Journal of Robotics Research*, *30*(13), 1585–1595.
- Kulchenko, P., & Todorov, E. (2011). First-exit model predictive control of fast discontinuous dynamics: Application to ball bouncing. In *IEEE international conference on robotics and automation* (pp. 2144–2151).
- Li, W., & Todorov, E. (2007). Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic system. *International Journal of Control*, *80*(9), 1439–1453.
- Long, A. W., Murphey, T. D., & Lynch, K. M. (2011). Optimal motion planning for a class of hybrid dynamical systems with impacts. In *IEEE international conference on robotics and automation* (pp. 4220–4226).
- Mitrovic, D. (2010). *Stochastic optimal control with learned dynamics models*. PhD Thesis, The University of Edinburgh.
- Nakanishi, J., & Vijayakumar, S. (2012). Exploiting passive dynamics with variable stiffness actuation in robot brachiation. In *Robotics: Science and systems* (pp. 305–312).
- Nakanishi, J., Fukuda, T., & Koditschek, D. E. (2000). A brachiating robot controller. *IEEE Transactions on Robotics and Automation*, *16*(2), 109–123.
- Nakanishi, J., Rawlik, K., & Vijayakumar, S. (2011). Stiffness and temporal optimization in periodic movements: An optimal control approach. In *IEEE/RSJ international conference on intelligent robots and systems* (pp. 718–724).
- Posa, M., Cantu, C., & Tedrake, R. (2014). A direct method for trajectory optimization of rigid bodies through contact. *International Journal of Robotics Research*, *33*(1), 69–81.
- Radulescu, A., Howard, M., Braun, D. J., & Vijayakumar, S. (2012). Exploiting variable physical damping in rapid movement tasks. In *IEEE/ASME international conference on advanced intelligent mechatronics* (pp. 141–148).
- Rawlik, K., Toussaint, M., & Vijayakumar, S. (2010). An approximate inference approach to temporal optimization in optimal control. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. Zemel, & A. Culotta (Eds.), *Advances in neural information processing systems* (Vol. 23, pp. 2011–2019). Cambridge: MIT Press.
- Rawlik, K. C. (2013). *On probabilistic inference approaches to stochastic optimal control*. PhD Thesis, The University of Edinburgh.
- Rosa, N., Jr., Barber, A., Gregg, R. D., & Lynch, K. M. (2012). Stable open-loop brachiation on a vertical wall. In *IEEE international conference on robotics and automation* (pp. 1193–1199).
- Saito, F., Fukuda, T., & Arai, F. (1994). Swing and locomotion control for a two-link brachiation robot. *IEEE Control Systems Magazine*, *14*(1), 5–12.
- Shadmehr, R. (1990). Learning virtual equilibrium trajectories for control of a robot arm. *Neural Computation*, *2*(4), 436–446.
- Sussmann, H. J. (1999). A maximum principle for hybrid optimal control problems. In *Conference on decision and control* (pp. 425–430).
- Tassa, Y., Erez, T., & Todorov, E. (2012). Synthesis and stabilization of complex behaviors through online trajectory optimization. In *IEEE/RSJ international conference on intelligent robots and systems* (pp. 2144–2151).
- Van Ham, R., Sugar, T. G., Vanderborght, B., Hollander, K. W., & Lefeber, D. (2009). Compliant actuator designs. *IEEE Robotics and Automation Magazine*, *16*(3), 81–94.
- Van Ham, R., Vanderborght, B., Van Damme, M., Verrelst, B., & Lefeber, D. (2007). MACCEPA, the mechanically adjustable compliance and controllable equilibrium position actuator: Design and implementation in a biped robot. *Robotics and Autonomous Systems*, *55*(10), 761–768.
- Xu, X., & Antsaklis, P. J. (2003). Quadratic optimal control problems for hybrid linear autonomous systems with state jumps. In *American control conference* (pp. 3393–3398).
- Xu, X., & Antsaklis, P. J. (2004). Optimal control of switched systems based on parameterization of the switching instants. *IEEE Transactions on Automatic Control*, *49*(1), 2–16.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.