Model predictive control for fast reaching in clutter
 2.9k Downloads
 8 Citations
Abstract
A key challenge for haptically reaching in dense clutter is the frequent contact that can occur between the robot’s arm and the environment. We have previously used singletimestep model predictive control (MPC) to enable a robot to slowly reach into dense clutter using a quasistatic mechanical model. Rapid reaching in clutter would be desirable, but entails additional challenges due to dynamic phenomena that can lead to higher forces from impacts and other types of contact. In this paper, we present a multitimestep MPC formulation that enables a robot to rapidly reach a target position in dense clutter, while regulating wholebody contact forces to be below a given threshold. Our controller models the dynamics of the arm in contact with the environment in order to predict how contact forces will change and how the robot’s end effector will move. It also models how joint velocities will influence potential impact forces. At each time step, our controller uses linear models to generate a convex optimization problem that it can solve efficiently. Through tens of thousands of trials in simulation, we show that with our dynamic MPC a simulated robot can, on average, reach goals 1.4 to 2 times faster than our previous controller, while attaining comparable success rates and fewer occurrences of high forces. We also conducted trials using a real 7 degreeoffreedom (DoF) humanoid robot arm with wholearm tactile sensing. Our controller enabled the robot to rapidly reach target positions in dense artificial foliage while keeping contact forces low.
Keywords
Clutter Haptic MPC MulticontactList of symbols
 \({\varvec{A}}_d, {\varvec{B}}_d\)
Statespace matrices that are discrete time linear approximations of the dynamics for a robot in contact
 \({\varvec{C}}({\varvec{\dot{q}}},{\varvec{q}})\)
Coriolis and centrifugal matrix
 \({\varvec{d}}_{grav}\)
An integral term added to the cost function to counter errors in gravity compensation
 \({\varvec{e}}_{int}\)
Approximate integral of end effector position error
 \({\varvec{e}}_{o}\)
Current error in end effector position
 \({\varvec{f}}_{i}^{ext}\)
External contact force vector on robot links
 \({\varvec{f}}_{i}^{measured}\)
Measured normal force for contact i
 \(f_{threshold}\)
Userdefined allowable contact force threshold
 \({\varvec{F}}({\varvec{\dot{q}}})\)
Coulomb and viscous joint friction
 \({\varvec{G}}({\varvec{q}})\)
Configuration dependent gravity joint torques
 \({\varvec{\hat{G}}}({\varvec{q}})\)
Estimate of configuration dependent gravity joint torques used for gravity compensation
 \(H_u\)
Number of time steps in the prediction model where there is control authority
 \(H_y\)
Number of time steps in the prediction model with the control input set to zero
 \({\varvec{J}}_{ee}\)
Geometric Jacobian at the end effector
 \({\varvec{J}}_{c_i}\)
Geometric Jacobian at contact i
 k
Discrete time index
 \({\varvec{K}}_{c_i}\)
Cartesian stiffness matrix for contact i
 \(k_i\)
Gain on the error integrator, \({\varvec{e}}_{int}\)
 \({\varvec{K}}_{p}\)
Diagonal joint stiffness matrix
 \({\varvec{K}}_{d}\)
Diagonal joint damping matrix
 \({\varvec{M}}({\varvec{q}})\)
Configuration dependent jointspace inertia matrix
 m
Number of degrees of freedom of given robot linkage
 \({\varvec{\hat{n}}}_{c_i}\)
Unit vector normal to the surface of the robot at the location of contact i
 N
The number of contacts at any given time instant
 \({\varvec{q}},{\varvec{\dot{q}}}\)
State variables of joint angle and velocity
 \({\varvec{q}}_{eq}\)
Commanded equilibrium joint angles that are sent to the joint impedance controller
 \({\varvec{q}}_{min}\)
Minimum allowable joint angle limits
 \({\varvec{q}}_{max}\)
Maximum allowable joint angle limits
 \({\varvec{q}}_{o}\)
Initial joint configuration at current time
 \(t_{o}\)
Current time which is always the starting point for the predictive model
 \({\varvec{x}}_{c_i}\)
Cartesian position for contact i
 \({\varvec{x}}_{ee}\)
Cartesian position of the end effector
 \(\alpha , \beta , \mu , \zeta \)
Scalar weighting terms for the multiobjective cost function
 \(\varDelta f_{rate,i}\)
Maximum desired rate at which the contact force should be allowed to change at contact i
 \(\varDelta {\varvec{q}}_{eq}\)
Change in equilibrium joint angles, this is the output of MPC
 \(\varDelta {\varvec{q}}_{max,eq}\)
Maximum allowable change in commanded joint angles
 \(\varDelta t_{impact}\)
Time duration of an unexpected impact
 \(\varDelta {\varvec{x}}_{des}\)
Desired change in position at the end effector
 \(\varDelta t_d\)
Size of continuous time step used to generate discretetime difference equations for prediction and dynamic MPC
 \({\varvec{\tau }}_{ext}\)
Joint torques that result from the sum of all external forces due to contact
 \({\varvec{\tau }}_{control}\)
Commanded joint torque that results from “simple joint impedance control” and gravity compensation calculations
 \({\varvec{\tau }}_{impact}\)
Average torque due to impact forces occurring during an unexpected collision
 \(x_{th}\)
The integral term \({\varvec{d}}_{grav}\) becomes effective when \({\varvec{e}}_0\) is below this threshold value
1 Introduction
The current state of capabilities for robot manipulation in domains such as inhome assistance, search and rescue, and natural or military disasters lags behind human capability and speed. One particular capability at which humans (and other animals) excel is using haptic feedback to operate effectively in clutter. Robots with this capability could potentially perform tasks better in constrained and dynamic scenarios such as reaching into containers or cupboards without lineofsight, performing search and rescue in debris, or working alongside human coworkers.
In this paper, we specifically consider the problem of a robot arm (i.e., a serial manipulator) reaching into clutter in order to move its end effector to a target position. We define success to be when all contact forces that occur are low and the end effector attains a position close to the target. Success does not depend on the orientation of the end effector. In contrast to our previous research, we focus on enabling the robot to reach the target position in a short amount of time. As such, dynamic phenomena, such as inertia and impact forces, play an important role.
Model predictive control (MPC) offers a promising approach to handling the multiple objectives and constraints associated with reaching in clutter. As we have previously discussed in Jain et al. (2013), while reaching in dense clutter, a robot is likely to make contact between its arm and the environment at multiple locations. Moreover, the robot is unlikely to anticipate each contact event before it occurs. In Jain et al. (2013), we presented a singletimestep model predictive controller that used a quasistatic mechanical model of the robot in contact with the environment. This controller, which we refer to as quasistatic MPC, achieved good empirical performance across a variety of circumstances in terms of enabling robots to successfully reach target locations while keeping contact forces low. Quasistatic MPC assumed that the robot had wholearm tactile sensing and compliant joints. Our results also provided evidence that, when reaching in clutter, wholearm tactile sensing and jointtorque sensing together enable superior performance compared to jointtorque sensing alone or jointtorque sensing with a force–torque sensor on each link for contact force sensing.
However, the quasistatic model used by quasistatic MPC did not account for dynamic properties, such as link inertia and joint damping. As expected, the robot performed best when moving at low velocities for which the dynamics become negligible. Many tasks would benefit from faster robots, but effectively controlling forces from multiple intermittent and unexpected contacts presents a substantial challenge. Faster end effector velocities tend to result in both higher impact forces and higher forces from persistent contact. Even intentionally slow motion with compliant joints can result in dynamic phenomena, such as when an arm with a preloaded compliant joint slips off of an object, allowing the joint to release its stored energy and thereby accelerate the robot links.
In this paper, we present a controller that enables a robot manipulator to rapidly move its end effector and simultaneously control contact forces in the presence of multiple contacts along the entire robot arm. We present a multitimestep model predictive controller that models dynamic phenomena relevant to the task of reaching in clutter. This controller, which we refer to as dynamic MPC, models the dynamics of the arm in contact with the environment in order to predict how contact forces will change and how the robot’s end effector will move. Among other factors, the model considers link inertia and damping at the robot’s joints. Dynamic MPC also models how joint velocities will influence potential impact forces based on a collision model. This enables the robot to select joint velocities that mitigate potential impact forces. At each time step, dynamic MPC uses linear models that locally approximate the nonlinear dynamics in order to generate a convex optimization problem that it can solve efficiently.

Contact that results in forces below a user defined threshold is acceptable.

The robot has some form of compliance at its joints (either active or passive).

Tactile sensors cover the entire surface of the robot arm.
We have organized the paper as follows: Sect. 2 discusses related prior research. Section 3 covers the mathematical formulas and assumptions for formulating our controller. Section 4 describes our software, hardware and experimental setup. In Sect. 5, we show evidence for the accuracy of our simplified dynamic model over a short, but useful, time horizon. In Sect. 6 we present and discuss results from comparing quasistatic MPC and dynamic MPC in our software simulation testbed. We present results from a global reaching task with a real robot in Sect. 7. We present results from local tests that emphasize aspects of contact force control on a real robot with groundtruth force–torque data in Sect. 8. We use the term global reaching task to refer to a trial for which the robot’s end effector reaching a target location is an important measure of success. We use the term local test to refer to a trial that involves the robot maneuvering for a short period of time over a short distance without consideration of whether or not the robot’s end effector attains a target position. We conclude with a discussion of applications and an overview of our results in Sects. 9 and 10.
2 Related work
We first presented our dynamic controller in a conference paper published at Humanoids 2013 (Killpack and Kemp 2013). The conference paper only conveyed results from two simulated threelink planar robots. In this journal article, we present extensive results from dynamic MPC running on a real 7 DoF humanoid robot arm. In addition, we provide a more thorough formulation of the controller, details about how we altered the controller to run on the real robot, and an empirical evaluation of the fidelity of the controller’s approximate linear models.
Many common approaches to robotic manipulation are poorly matched to the challenges of reaching in dense clutter. Methods often rely on collisionfree arm motion, lineofsight sensing of the volume to be traversed, or detailed geometric models prior to reaching (Dogar and Srinivasa 2011; Hornung et al. 2012; Kavraki and laValle 2008; Leeper et al. 2013a, b; Saxena et al. 2008, 2011; Srinivasa et al. 2009; Stilman et al. 2007). Most robotic manipulation research has emphasized avoiding contact except at the end effector (Guo and Hsia 1993; Katz et al. 2013; Latombe 1990; Stilman 2010) or other single point contact locations (De Luca et al. 2006; De Luca and Ferrajoli 2008), even as the robot attempts to operate in unstructured environments. Such restrictions on contact unnecessarily limit a robot’s actions when lowforce contact is benign and allowable for a given task. Our approach allows multiple contacts along the entire arm surface and uses only haptic feedback to reach through unknown environments.
Work in Erez and Todorov (2012) shows that contact along the arm may be permissible when performing a task, but requires detailed geometric models of the environment, assumes rigid contact, and does not use sensor feedback during the tasks. While results in Mordatch et al. (2012), and Mordatch et al. (2012) use an optimal control formulation and explicit contact modeling to perform multicontact tasks, but produce openloop trajectories and do not use online haptic feedback.
The majority of robotics research on unwanted or unmodeled collisions has focused on reacting to impacts after collision has occurred (De Luca et al. 2006; De Luca and Mattone 2004; Haddadin et al. 2008). This includes work that quantifies forces during impact (Phan et al. 2011) using novel sensing technology and work that models the instantaneous stiffness effects during collision (Shin et al. 2011). Extensive research has aimed to quantify the potential for personal injury from robothuman collisions (Haddadin et al. 2011) and some work has been done to limit robot joint velocities accordingly (Haddadin et al. 2012). In our work, we use an impactmomentum model in our cost function to regulate joint velocities to mitigate contact forces from unexpected impacts without limiting all joint velocities uniformly.
In regards to our approach for control, one of the earliest application areas for MPC was chemical process control (Garcia et al. 1989). MPC is also often referred to as receding horizon control and has been used in work on the control of aerial vehicles (Abbeel et al. 2010; Bellingham et al. 2002). MPC has also been used in robot locomotion research (e.g, Erez et al. 2012; Manchester et al. 2011; Wieber 2006). In terms of robot manipulation, MPC has recently been used in applications such as bouncing a ball (Kulchenko and Todorov 2011), generating manipulator trajectories to compensate for inertial forces on a boat (From et al. 2011), controlling a 6 DoF cabledriven parallel manipulator (Duchaine et al. 2007), and reaching in free space (Ivaldi et al. 2010).
3 Model predictive control formulation
In this section, we first explain the architecture of our lowlevel control framework. We then present the mathematical models our controller uses for predicting the motion and contact forces for our robot. For the prediction step of the controller, we use a forward, discretetime prediction model. We subsequently show the form of our model predictive controller. Details about our previous work with quasistatic MPC, against which we compare performance with dynamic MPC, can be found in Jain et al. (2013).
The list of symbols at the beginning of the paper summarizes the nomenclature we will use. Lower case variables that are bold face are vectors, while upper case variables that are bold face are matrices, and any nonbold face variable is a scalar.
3.1 Overall controller structure
While our implementation is mathematically analogous to proportional derivative control (PD control), the semantics are different. For example, the equilibrium angles, \({\varvec{q}}_{eq}\), are not desired joint angles, and a difference between the equilibrium angles and the current joint angles, \({\varvec{q}}\), is not necessarily error. Likewise, \(K_{p}\) and \(K_{d}\) are more aptly thought of as defining a desirable mechanical impedance for the joints (e.g., high compliance for our system) rather than as proportional and derivative gains to reduce errors over time.
3.2 Robot equations of motion
In order to use MPC, we need models to predict the change in the robot’s state given control inputs and disturbances. In the following sections, we derive nonlinear contact and robot state models for a serial manipulator in contact with its environment. We then present linear contact and robot state models that approximate these nonlinear models and which can be used for dynamic MPC.
3.2.1 Contact model
Our system models each contact as a linear spring that only applies force normal to the surface of the robot’s arm. Any forces tangent to the surface of the arm are ignored. At each time step, the system creates a new mechanical model with a linear spring at each location on the arm at which contact has been detected. In our implementation, the robot detects contact when a measured contact force exceeds a threshold.
3.2.2 Linear robot model
For our dynamic robot model, we neglect joint friction (\({\varvec{F}}({\varvec{\dot{q}}}) = 0\)). If viscous joint friction were significant, we could incorporate it into our model of joint damping in the simple impedance controller.
Section 6 shows that using a horizon of 5 with dynamic MPC outperformed quasistatic MPC across 4800 simulated trials in terms of speed, success rates and regulation of maximum contact forces. Section 5 shows that our linear model can predict the future state of a robot with low error.
In Sect. 5, we provide empirical support for the fidelity of our linear model, and our evaluations of dynamic MPC demonstrate that this model can be used effectively. The approximations we made result in a computationally efficient formulation that is straightforward to implement. Conventional linearization would require the calculation of numerous derivatives at each time step, which would be complicated by the number of contacts changing over time.
3.2.3 Impulse–momentum Impact Model
With this model of impulse–momentum between a contact force and joint torque, the worst case scenario for impact location for joint j is when \(d_{impact}\) approaches zero. Figure 4 shows a visual representation of how \({\varvec{\tau }}_{impact}\) is calculated for the first joint. From Fig. 4, it is clear that as \(d_{impact}\) goes to zero, \(f_{threshold}\) must go to infinity to give the same average torque. However, letting \(d_{impact}\) go to zero is unnecessarily conservative for formulating a joint velocity constraint. We wish to avoid the frequent occurrence of high impact forces, but we are not designing the controller for the worst case scenario. To achieve good empirical performance, we tuned \(\varDelta t_{impact}\) for a robot while holding \(d_{impact}\) constant. Since \(\varDelta t_{impact}\) and \(d_{impact}\) are both constant scalars that only appear in this equation, this is equivalent to tuning their product.
Our impulse–momentum constraint described in Eq. 25 does not take into account joint compliance. In addition, contact with some link geometries could cause \(d_{impact}\) to be small and result in high forces. However, our evaluations suggest that this collision model is effective in practice. This model also has the advantage of being linear.
3.3 Limits and saturation in robot model
A benefit of using MPC is that the robot can plan over a short time horizon, adhering to physical constraints on state and input variables in addition to other user defined constraints (such as force thresholds using models in Sect. 3.2.1). In terms of physically meaningful constraints, we include limits on physical joint angles and high joint velocities (as presented in Sect. 3.2.3). We also include a model of actuator saturation. This takes one of two forms, either (1) limiting the amount that the desired joint angle can be changed at each time step (see Eq. 34), or (2) limiting the total torque each joint can apply (see Sect. 7.2).
3.4 Dynamic model predictive controller
Using the equations of motion from Eq. 20, along with the contact models and other constraints, we created a model predictive controller that accounts for dynamics (dynamic MPC). A major difference from quasistatic MPC (see Jain et al. 2013), which used a single time step time horizon, is that dynamic MPC has a horizon that consists of multiple time steps. Dynamic MPC uses a horizon of length \(H_u\) during which the controller can apply control effort and a subsequent horizon of \(H_y\) time steps during which the controller applies no control effort. This is a common approach to improve stability and robustness and can be found in Rossiter (2003). The values we used for \(H_u\) and \(H_y\) are presented in relation to specific tests in subsequent sections. The actual physical time that the controller predicts into the future depends on the rate (\(\frac{1}{\varDelta t_d}\)) at which the controller runs. For example, using a total time horizon of ten discrete steps, a time step \(\varDelta t_d=0.01\) s would result in the controller predicting up to 0.1 s into the future, and would imply that the controller expects to run at 100 Hz.
In our original formulation, we defined constraints on the allowable contact force. However, due to feasibility issues with convergence of the optimization, we removed these constraints and added terms related to contact forces to the cost function. The constraints in our current formulation consist of the discrete dynamic equations (Eqs. 30 and 31), limits on joint actuation and angles (see Eqs. 32, 33, 34) and the joint velocity constraint described in Sect. 3.2.3, (see Eq. 35). The actuation limits described in Eq. 34 are for the simulation testbed. For the fullsized real robot DARCI, this actuator model takes a different form to limit the torques applied at the joints. This form is described in Sect. 7. The variable \(\varDelta x_{des}\) is a waypoint that is a fixed step size in a straight line towards the target position unless the distance to the target position is smaller than the nominal step size. Equation 35 describes the impulse–momentum constraint from Sect. 3.2.3. The result of this optimization at each time step is a series of \(\varDelta {\varvec{q}}_{eq}\) values from which the controller executes only the first.
We used an optimization tool named CVXGEN to generate efficient C code that solves this optimization problem (see Mattingley and Boyd 2009, 2010, 2012 for details about this tool for webbased convex optimization code generation). Our system uses Python code to generate the data required by this C code. For the average number of contacts experienced during our trials, this Python code executed in around 1–4 ms. The C code solver requires arguments with fixed dimensions. It solved this optimization problem with multiple contacts and a multistep horizon in around 4–10 ms, depending on the number of degrees of freedom of the robot being controlled.
3.5 Evaluation of controller
Our strategy in evaluating our controllers is based on empirical measures of success at reaching a specified goal in clutter while keeping the contact forces low. Due to the nature of clutter, robots will not always succeed at reaching a target and the contact that occurs can vary widely. To represent this variability, we randomly generated simulated environments and representative real physical environments to represent the complexity of real clutter. We then used success rates, contact forces, and timetocomplete to evaluate dynamic MPC and compare it with our prior work. Details of quasistatic MPC, which we use as a baseline comparison in simulation can be found in Jain et al. (2013) and Killpack (2013).
Infrastructure and testbeds that we used to test our controller include two different software simulation testbeds, and a humanscale robot with torso, mobile base, and two 7 degreeoffreedom arms. Section 4 describes these testbeds in detail.
4 Testbeds and infrastructure
4.1 Software simulation testbeds
We developed two simulation testbeds for controller development and testing. We first performed tests using the MATLAB Robotics Toolbox (Corke 1996) which gives more control over the numerical integration methods and the contact models that are used (for more details see Killpack 2013; Killpack and Kemp 2013). We also used the Open Dynamics Engine (ODE Smith et al. 2011), which is an open source physics SDK. Both simulations included a simple joint impedance controller running at 1 kHz. They also both simulated the same three link arm described in Jain et al. (2013), Killpack (2013), and Killpack and Kemp (2013). This arm has kinematics, mass and joint limits similar to a human holding a hand outstretched and manipulating in a plane parallel to the ground at shoulder height (see Fig. 5). The three joints had stiffnesses of 30, 20, and 15 Nm/rad, and damping values of 15, 10, 8 Nms/rad, which correspond to the torso, shoulder, and elbow joints.
4.1.1 MATLAB simulation using robotics toolbox
We used our MATLAB simulation testbed to prototype model predictive controllers with the dynamic models presented in Sect. 3.4. This higher fidelity simulation was valuable for testing the fidelity of our dynamic model as we found that although ODE was better at simulating large numbers of contact, it had lower numerical stability and integration accuracy (see Killpack 2013).
Our MATLAB implementation used an explicit springdamper model for simulating contact. Since there is no native geometry collision library with MATLAB, we also simulated discrete tactile sensing elements (taxels), that were spaced one centimeter apart along the center of each link to which our implementation assigned any simulated contact force. All objects were assumed to have a 1.5 cm radius and a springdamper contact was simulated when the extent of the arm (which was assumed to have a 1.5 cm radius) was within the extent of the object. Objects were simulated as having a stiffness of 5000 N/m and damping value of 10 Ns/m for the tests in Sect. 5.
4.1.2 Open source dynamics engine (ODE)
The ODE software testbed allowed us to simulate a large number of trials and prototype our controllers in highly cluttered workspaces. This was particularly beneficial as we could generate large data sets to test controller performance.
4.2 Real robot platform: DARCI
We used a humanoid mobile manipulator called DARCI, a Meka M1 Mobile Manipulator (see Fig. 6). DARCI has two 7 degreeoffreedom arms that have serieselastic actuators (SEAs), are torque controlled and include gravity compensation. Joint stiffness and damping gains for the simple joint impedance controller are 43, 43, 43, 43, 2.6, 3.4, 3.4 Nm/rad and 2.6, 4.3, 0.64, 0.64, 0.064, 0.090, 0.090 Nms/rad where they are listed from the most proximal to the most distal joint. The first three values in each list correspond to the shoulder, the fourth value to the elbow, and the last three values to the wrist. We used ROS to send commanded joint angles (\({\varvec{q}}_{eq}\)) to the lowlevel impedance controller. The computer we used to run our MPC solver in real time had a 32bit Ubuntu operating system with 16 GB of RAM and a 3.40 GHz Intel Core i73770 CPU. The solver only used a single core, running as a single process.
4.3 Tactile sensing hardware
We describe specifics of our tactile sensor implementation in Bhattacharjee et al. (2013). The tactile sensor we used on DARCI is a sleeve pulled over the end effector, wrist, and forearm, that consists of five layers of fabric. An inner and outer layer that protect and insulate the sensor’s interior, and two layers of conductive fabric that go on either side of a layer of electrically resistive fabric that decreases resistance as the pressure on the cloth increases. One of the conductive layers is split into separate rectangles; where each rectangle is a taxel. The sleeve has 25 taxels total consisting of a single taxel at the tip of the end effector and an array of 24 taxels with 4 taxels around the arm’s circumference and 6 along the arm’s length. When a taxel detects contact, the system uses the geometric center of the taxel as the location of contact. Figure 6 shows the tactile sensor on the left arm of DARCI.
We calibrated the tactile sensor in order to convert the raw sensor measurements to forces in Newtons. We fit an exponential curve to a plot of tactile sensor readings versus ground truth from a force–torque sensor when pushing on a taxel with various forces. We used the calibration curve from one taxel on all taxels. Although the values reported by these sensors vary in complex ways based on other parameters such as contact area, pressure, and hysteresis, when quasistatic MPC (see Bhattacharjee et al. 2013) and dynamic MPC (see Sect. 8) have used these values they have also performed well with respect to groundtruth forces.
In this paper, our systems detected contact when a tactile sensor’s measurements exceeded a threshold. This threshold was 0.5 N for the simulated robot and 0.2 N for the real robot.
5 Dynamic model prediction accuracy
The performance of MPC is limited by the quality of the model used to make predictions. For dynamic MPC, it uses a linear approximation of the nonlinear robot dynamics. This approximation assumes that over the prediction horizon changes to the mass and Coriolis matrices are negligible and can thus be treated as constants (see Sect. 3.2.2). In this section we examine the accuracy of our simplified dynamic model for the simulated threelink planar arm in MATLAB as described in Sect. 4.1.1.
5.1 Testing accuracy of open loop dynamic model prediction
For both sets of trials, at every time step, we solved the model predictive control optimization problem. This solution results in a control sequence over the control horizon (\(H_u\)) as well as a prediction of how that control sequence will change the joint velocities and angles over that same horizon. At each time step we simulated the motion of the arm using our linear model as if we were to apply all of the control steps over the entire control horizon instead of just the first step. Then, using the full nonlinear equations, we also simulated and compared the actual change in joint angles and velocities to the change predicted by our linear dynamic model. After this simulation over just the horizon (\(H_u\)), we would reset the state of the arm as if we had only taken the first control step and repeat the process for the next time step. This means that as the arm progressed towards the goal using MPC (executing only the first control input), we also simulated with both our linear and nonlinear models five steps into the future to check the error in our linear prediction model.
In order to quantify the error of the prediction model, we group measurements from all three joints and looked at statistics on the error from the actual and predicted joint angles across the horizon of 5 control steps (\(H_u = 4\) since the implementation starts with an index of 0).
5.1.1 Prediction of arm moving in free space
5.1.2 Prediction of arm moving in clutter with contact
The error in the prediction is higher relative to the actual change in joint position than it was for movement in free space. However, the error is still much smaller than the actual change.
The error in the prediction is more apparent when moving in contact than when moving in free space. Although the error and the variance are larger at the the end of the horizon (step 5), when dynamic MPC is actually running it updates its contact information at every time step, which would reduce errors due to contact being made or broken.
In this section, we have shown that using our dynamic model, the change in predicted joint angles correlates strongly with the actual change in the joint angles. We also have shown that the error in our predictions due to our linear approximations is small relative to the actual change in the robot state.
6 Dynamic versus quasistatic MPC for reaching in simulated clutter
In this section we compare dynamic MPC against quasistatic MPC from Jain et al. (2013) in ODE using the testbed shown in Fig. 5.
In order to compare the performance of our controllers, we ran four sets of 1,200 trials each in ODE for our dynamic controller and for quasistatic MPC. We generated these four sets by varying the density of clutter between 20 fixed objects and 80 fixed objects, and the value of the force threshold between 5 and 25 N. For quasistatic MPC, we used the same data for the 5 N threshold that was reported in Jain et al. (2013). However, for the 25 N threshold, we also generated new results for quasistatic MPC.
Our initial attempts at tuning the parameters of dynamic MPC involved setting the gain on the position cost term first (Eq. 26) and then trying to manually vary the other parameters to achieve desirable performance. However, our multiobjective cost function along with other tunable parameters such as the waypoint magnitude size made tuning the controller difficult and sometimes unintuitive. In order to search the parameter space for our controller, we used simulated annealing on 15 randomly selected environments in ODE (5 with 20 fixed objects and 10 with 80 fixed objects) that were not part of the data set we used for evaluating the controllers. We ran these trials with two different maximum desired force thresholds of 5 and 15 N, and accumulated the cost across all 30 trials. Our cost function for simulated annealing was based on four terms: a cost on the time to complete a trial; a cost on forces over the threshold; a cost on the maximum force in a trial; and a cost on failure to reach the goal. This optimization resulted in a Pareto front that gave us a better intuition for the tradeoff between control parameters while at the same time improving performance significantly. We tuned the weightings between the cost terms such that the trade off (especially between costs on force and time to complete) caused the optimization to more equally explore the Pareto front while searching mostly in the space where success rates remained on the same order of magnitude. More details on this tuning with a nine dimensional parameter space are discussed in Killpack (2013).
For quasistatic MPC, we used the nominal tuning from Jain et al. (2013) which is one limitation of our comparison since further tuning might improve performance.
For all of the simulation trials we set the control horizon \(H_u = 4\) and prediction horizon \(H_y = 1\) according to their description in Sect. 3.4. The parameter \(d_{impact}\) as described in Sect. 3.2.3 was set to 0.02 m for all joints which is the diameter of a single cylinder from our simulation testbed. The task for all simulations was specified as reaching to a goal location and we used the same stopping criterion for success or failure as used in Jain et al. (2013), and Killpack and Kemp (2013). The trial was deemed a success if the arm reached the goal within 2 cm. It was deemed a failure if the max forces were greater than 100 N or the time took longer than 100 s.
Summary statistics for comparing success rate, contact force control and time to reach the goal between dynamic and quasistatic MPC at different densities of clutter and for different force thresholds
Low clutter (20 objects)  High clutter (80 objects)  

Dynamic model  Quasistatic model  Dynamic model  Quasistatic model  
High force threshold (25 N)  
Success rate  \({\varvec{74}}.{\varvec{6\,\%}}\)  72.3 %  \({\varvec{24}}.{\varvec{6\,\%}}\)  23.3 % 
99 percentile contact force value (N)  28.3  27.6  28.5  27.8 
(1/100 chance of this force)  
99.9 percentile contact force value (N)  32.3  36.1  34.1  39.8 
(1/1000 chance of this force)  
Maximum force in alltrials  38.2  74.6  39.5  196.4 
Avg. time to complete (s)  10.9  22.3  14.2  21.1 
Low force threshold (5 N)  
Success rate  \({\varvec{80}}.{\varvec{9\,\%}}\)  77.3 %  \({\varvec{30}}.{\varvec{1\,\%}}\)  28.3 % 
99 percentile contact force value (N)  6.5  7.9  7.1  8.6 
(1/100 chance of this force)  
99.9 percentile contact force value (N)  9.9  21.2  10.6  22.0 
(1/1000 chance of this force)  
Maximum force in all trials  21.0  68.2  22.6  122.1 
Avg. time to complete (s)  12.8  22.3  14.6  21.1 
6.1 Comparison results from ODE simulation
6.1.1 Success rates
Dynamic MPC achieved equal or higher success rates than the quasistatic MPC for all force threshold and clutter settings. In addition, we used a standard significance test for comparing two proportions (McClave et al. 2008) and showed that for two of the four test settings, dynamic MPC’s higher success rate was statistically significant with a pvalue less than 0.05.
6.1.2 Contact force regulation
For the 99th percentile contact force value (meaning that 1/100 forces measured above the noise threshold was this high or higher), the results were comparable for the two controllers. Figure 12 shows the calculated percentage of contact forces in the yaxis that were above the force value shown on the xaxis for both clutter and force threshold settings. Both controllers are able to successfully affect the measured contact forces according to the force threshold specification. Another way to compare the controllers is to look at events that occur comparatively infrequently, but can have catastrophic effects depending on the specific application. One example of this is extremely high forces. Table 1 shows the value for forces that occur at least once in 1000 force measurements (which at 100 Hz sampling rate is occurring every 10 s that the arm is in contact) as well as the maximum overall measured force. For these two forms of evaluation, dynamic MPC does better in all four settings (high/low clutter and high/low force threshold).
6.1.3 Comparing speed of robot end effector
Table 1 also shows that in all cases dynamic MPC completes the task faster than quasistatic MPC. The speed increase using dynamic MPC for the task ranges from 1.45 to 2.04 times faster than quasistatic MPC. This means that while dynamic MPC is at times moving up to twice as fast as quasistatic MPC, it is still successfully controlling the force and reaching the goal as seen in the previous sections.
6.2 Limitations of software simulation results
For the simulation results in this section, we have reported results on fixed, stiff objects only. This is in part because fixed, stiff objects initially proved more problematic for controlling the forces using a dynamic model. However, we noticed that for the movable objects that we had used in Jain et al. (2013), dynamic MPC also moved too slowly and conservatively. This is likely due to the contact model where we used a fixed, high stiffness value to represent each contact. Interestingly, in the experiments that we conducted with the real robot, DARCI, the controller performed in foliage environments successfully moving leaves and branches (see Sect. 7). This may indicate that this issue is less important in real world scenarios.
7 Results for global task of reaching in clutter using the dynamic model predictive controller with a real robot
In this section we present results using dynamic MPC on the real robot DARCI. We show the ability of dynamic MPC to reach randomly generated goal locations in the clutter. We also show that according to the tactile sensor measurements, the arm is able to control its forces.
7.1 Cluttered environment used for testing
The environment that we used for testing reaching in clutter was the same as in Bhattacharjee et al. (2013). This environment consists of plastic leaflike vegetation and solid wooden trunks as can be seen in Fig. 14. Certain sections of the workspace with leaflike vegetation can be difficult to push through if the robot respects an allowable contact force threshold. This is due to the density and stiffness of the plants. Figure 14 also shows the coordinate frame used, which is centered between the two robot arms and oriented as shown.
7.2 Dynamic MPC adaptation for DARCI
Adapting the controller used in simulation in Sect. 6 to run on DARCI involved a few changes. We used the same controller parameters that we identified for simulation except for \(\varDelta t_{impulse}\) and the waypoint magnitude size. These we tuned empirically by first setting the waypoint magnitude size without the joint velocity constraint being active. Then, we tuned \(\varDelta t_{impulse}\) until the impact forces, as measured by our tactile sensing skin, were close to the allowable force threshold.
For this controller, we set the control horizon \(H_u = 2\) and prediction horizon \(H_y = 3\) which gives three time steps for control (since \(H_u\) begins at zero) and then predicts the output for another four steps (since the controller predicts the state up to \(H_y + 1\)). The convex optimization problem was generated by CVXGEN (Mattingley and Boyd 2012). Although the optimization could solve an MPC problem at a rate between 50 and 100 Hz, we ran the controller at 25 Hz with 40 ms time steps. We did this because jitter in the control loop due to using a nonrealtime operating system with Python and ROS required that the controller run at a lower rate to get consistent timing in the control loop. Note that as long as the dynamic model used in MPC does not go unstable during our experiments because of the size of the time step, our discretization incorporates the time step into the model implicitly. For testing on DARCI, we also moved the constraint that describes our impulse–momentum model (see Eq. 35) into the cost function because noise on our joint velocity signal was causing the optimization to be infeasible when any joint was operating near its constrained joint velocity value.
7.3 Controller evaluation for reaching in clutter
To generate goal locations for reaching in the simulated foliage with DARCI, we first estimated the workspace of the arm in the foliage. We extended the arm as far as possible forward in the xy plane (according to Fig. 14) while moving through the workspace from left to right and recording the end effector positions. This traced out an approximate semicircle that covered most of the foliage. We then randomly sampled from a uniform distribution over x and y using this approximate semicircle in the xy plane while also randomly sampling from a uniform distribution in z between 5 and 30 cm above the ground plane of our testbed in the zdirection to determine goal locations. For testing, the robot reached into the clutter and recorded data for 105 reaches. Figure 13 shows DARCI reaching into the foliage. Extension 1 is a video that shows multiple successful and unsuccessful reaches. For each trial, the arm attempted to reach the goal for 20 s before classifying the attempt as success or failure. Success was determined as when the end effector was within 4 cm (or approximately 1.5 inches) from the desired goal.
7.4 Results in terms of success rates, contact forces and speed
The average time to complete for all of the successful trials was 3.0 s (noting that the end effector traveled an average distance of 35 cm across these same trials looking at the total end effector path traveled). The average velocity at the end effector for successful trials was 22.2 cm/s and for all trials was 12.5 cm/s. In these trials we found that the end effector moved on average 7.5 times faster for successful trials than velocities reported for quasistatic MPC on the robot Cody (Jain et al. 2013).
Table 2 summarizes the results and contains the values that are relevant to our task of reaching in clutter with dynamic MPC.
8 Results with local tests for reaching in clutter
Section 7 showed that our system enabled a robot to do fast reaching in clutter while keeping forces low. The trials from previous sections entailed examining a global task of reaching a Cartesian goal location. For this section, we performed tests to analyze the effect of varying the allowable contact force while measuring groundtruth data from force–torque sensors. On the real robot, we performed short trials representative of situations that resulted in high forces or failure in simulation.
Results for dynamic MPC reaching in foliage
Dynamic MPC  

Success rate  85.7 % (90/105) 
Exceeded safety threshold (15N)  0 times 
Avg. max. of all contact forces  0.9 N 
Avg. max. of contact forces over \(f_{thresh}\)  8.8 N 
Avg. of all contact forces  1.6 N 
Avg. of contact forces over \(f_{thresh}\)  8.1 N 
Avg. time to complete all trials  5.5 s 
Avg. time to complete successful trials  3.0 s 
8.1 Evaluating single contact force control with tactile sensor
For these trials we varied the force threshold (\(f_{threshold}\)) between 5, 10 and 15 N. We ran 10 reaches for each direction (left to right and right to left) resulting in a total of 20 trials at each setting. The purpose of these tests was to verify that varying the force threshold (\(f_{threshold}\)) had the expected effect on the real robot.
Figure 18 shows the summary statistics for each set of 20 trials as we held \(\varDelta t_{impulse}\) fixed at 4 s while varying \(f_{threshold}\). We calculated the resultant force from the force–torque measurement and used its magnitude as ground truth for a contact force. The statistics that we plotted include the maximum contact force, and the 99th and 50th percentile contact force for each value of \(f_{threshold}\). The correlation coefficient between the force threshold value and the 99th percentile forces is 0.99975.
As expected, increasing \(f_{threshold}\) for the controller is positively correlated with the maximum and 99th percentile forces that we measured while reaching in contact even if our tactile sensor calibration had some error in terms of groundtruth forces.
8.2 Performance of dynamic MPC for multicontact scenarios
From a wide range of multicontact scenarios we found in simulation trials from our previous work (see Jain et al. 2013), we developed a set of multicontact trials. We examined the histograms of contact locations on the three link simulated arm shown in Fig. 5 and identified contact configurations that occurred frequently or resulted in high force. We then manually confirmed that the multicontact trials were representative of common configurations by randomly sampling and visualizing the actual contact configurations from the simulation trials represented in the contact histograms. These tests are thus subjectively defined, but based on objective data for thousands of tests in simulation.
For each set of trials, we present a figure showing a typical arm trajectory we observed and force data. The resulting force data from the force–torque sensors was the focus of these tests rather than success in reaching the goal.
For all of these multicontact trials we used \(f_{threshold} = 5 \mathrm{N}\) and \(\varDelta t_{impulse} = 4 \mathrm{s}\). We ran 20 trials for each multicontact task. The termination criterion for each trial was successfully reaching the goal or a timeout of 15 s. Note that in these trials it was common for two taxels on our skin sensor to make contact with the force–torque sensor simultaneously (especially at the wrist joint). When this happened, the force–torque sensor value could be up to double the allowed force threshold and our controller would still be successfully controlling skin sensor forces since it was measuring and controlling with respect to two separate contacts.
8.2.1 Multicontact Test 1
Figure 19 shows the progression of multicontact test 1 over time. In this test, the robot arm starts in contact the tip of the end effector on the left side and in the middle of the forearm on the right hand side. In the second image in the sequence of Fig. 19, the arm first pushes against both rods as it tries to move to the goal. In subsequent motions the arm uses out of plane motion to move around the post and still get to the goal. This is one of the trials that we expected to have large forces due to what we saw in the simulation. However, the 3D motion and extra degrees of freedom made it straight forward for our controller to reach the goal and control the forces at the same time. See Extension 3 for video of this trial.
8.2.2 Multicontact Test 2
For the force results of this trial we had higher forces than for the first test. It should be noted that this configuration was more difficult than the first trial as the posts were placed closer together and we expected to have high forces due to jamming in this situation. Most forces were still limited to be below 15 N. This also matches what our test in Sect. 8.1 showed where the 99th percentile contact force for a single contact was about 15 N and corresponded to a 5 N allowable force threshold on the tactile sensor (see Fig. 22).
8.2.3 Multicontact Test 3
8.2.4 Multicontact results discussion
These are numerical values that summarize the data presented in the force histograms for the multicontact test cases in this section
Test  99th percentile force (N)  Mean of all forces  Mean of max forces  Max of max forces 

1  8.6  2.5  8.1  10.0 
2  15.9  7.5  13.8  25.3 
3  18.3  5.1  13.3  28.1 
In prior work (see Bhattacharjee et al. 2013), this type of Meka robot arm applied upwards of 40 N on a force–torque sensor while moving slowly and using the compliance of the simple impedance controller. In our case, it is clear that dynamic MPC is limiting contact forces to be within the same ranges we saw with quasistatic MPC in Bhattacharjee et al. (2013). In many of the trials on DARCI, there was no way to reach the goal, yet dynamic MPC regulated the forces to be generally low despite the fact that as it is noted in Bhattacharjee et al. (2013) “... the relationship between contact forces and taxel output is complex.”
8.3 Discussion of all local results
The results in this section show that the contact force trends we saw with measurements from only the tactile sensors while reaching in foliage (see Sect. 7) hold even when we use force–torque sensors to obtain ground truth.
9 Applications
An open question is the extent to which our controller can be used for other tasks, robots, and environments. We expect that controllers designed for fast reaching with wholearm contact may eventually serve as general purpose controllers for interaction with people in situations such as inhome assistance or search and rescue.
An anecdotal example of the kind of complex tasks that we envision is shown in the video of Extension 4. Because of the speed of dynamic MPC we can teleoperate DARCI with a Phantom Omni haptic input device to reach into an unmodeled and nonrigid canvas bag with unknown objects inside. The Cartesian position of the haptic input device is measured, scaled and fed to dynamic MPC as a goal position at each time step. The robot is then able to move towards this goal location while locally controlling the contact forces so as to not damage itself or the contents of the bag. The bag is also fixtured at one point to the table with a clamp.
10 Conclusion
We have shown that across a large number of simulated trials, using dynamic MPC while incorporating a linear approximation of the nonlinear dynamics of a robot arm in contact gives significant performance improvements in comparison to quasistatic MPC. In particular, we have shown that for two different settings of clutter and two different force thresholds, dynamic MPC had better success rates, faster speeds and comparable force control. We were also able to run dynamic MPC on the real robot DARCI using the manufacturer’s nominal dynamic parameters (i.e. no complex system identification was necessary). In addition, we have shown that running this controller in Python at around 25 Hz can result in good performance. We would expect that further engineering efforts would allow one to increase the control rate and further improve our force control. A variant of our dynamic MPC has even recently been used successfully in conjunction with haptic mapping, geometric planning, learning and tactile perception (Bhattacharjee et al. 2014).
Our results show the possibility of reaching at faster rates into cluttered environments while controlling velocities, forces, and mitigating the effects of unexpected impact.
Notes
Acknowledgments
We gratefully acknowledge support from DARPA Maximum Mobility and Manipulation (M3) Contract W911NF111603. Support from NSF Emerging Frontiers in Research and Innovation (EFRI) 1137229 allowed the use of the robot DARCI. This work also benefited from discussions with Wayne J. Book, Magnus B. Egerstedt, Lena Ting, Karen Liu, and Jun Ueda. We also thank Tapomayukh Bhattacharjee, Philip Grice, Advait Jain, Daehyung Park, and Joshua Wade for technical discussions on this work and contributions to the forearm tactile sensor and foliage testbed.
Supplementary material
Supplementary material 1 (mp4 105679 KB)
Supplementary material 2 (mp4 14406 KB)
Supplementary material 3 (mp4 12095 KB)
Supplementary material 4 (mp4 10687 KB)
References
 Abbeel, P., Coates, A., & Ng, A. Y. (2010). Autonomous helicopter aerobatics through apprenticeship learning. The International Journal of Robotics Research, 29, 1608–1639.CrossRefGoogle Scholar
 Bellingham, J., Richards, A., & How, J. P. (2002). Receding horizon control of autonomous aerial vehicles. In American Control Conference, 2002. Proceedings of the 2002 (vol. 5, pp. 3741–3746). IEEE.Google Scholar
 Bhattacharjee, T., Grice, P. M., Kapusta, A., Killpack, M. D., Park, D., & Kemp, C. C. (2014). A robotic system for reaching in dense clutter that integrates model predictive control, learning, haptic mapping, and planning. In Proceedings of the 3rd IEEE/RSJ international conference on intelligent robots and systems (IROS) workshop on robots in clutter: perception and interaction in clutter.Google Scholar
 Bhattacharjee, T., Jain, A., Vaish, S., Killpack, M. D., & Kemp, C. C. (2013). 2013 . In IEEE world haptics conference : Tactile sensing over articulated joints with stretchable sensors.Google Scholar
 Bhattacharjee, T., Kapusta, A., Rehg, J., & Kemp, C. C. (2013). Rapid categorization of object properties from incidental contact with a tactile sensing robot arm. In IEEE international conference on humanoid robotics, 2013.Google Scholar
 Brogan, W. L. (1991). Modern control theory. New Jersey: Prentice Hall.zbMATHGoogle Scholar
 Corke, P. I. (1996). A robotics toolbox for MATLAB. IEEE Robotics & Automation Magazine, 3(1), 24–32.CrossRefGoogle Scholar
 Craig, J. (2005). Introduction to robotics: mechanics and control. New Jersey: Prentice Hall.Google Scholar
 De Luca, A., AlbuSchaffer, A., Haddadin, S., & Hirzinger, G. (2006) Collision detection and safe reaction with the dlriii lightweight manipulator arm. In IEEE/RSJ international conference on intelligent robots and systems, 2006 (pp. 1623–1630). IEEE.Google Scholar
 De Luca, A., & Ferrajoli, L. (2008) Exploiting robot redundancy in collision detection and reaction. In: IEEE/RSJ international conference on intelligent robots and systems, 2008. IROS 2008 (pp. 3299–3305). IEEE.Google Scholar
 De Luca, A., & Mattone, R. (2004). An adaptanddetect actuator fdi scheme for robot manipulators. In ICRA.Google Scholar
 Dogar, M., & Srinivasa, S. (2011). A framework for pushgrasping in clutter. Robotics: Science and systems VII. Cambridge: MIT Press.Google Scholar
 Duchaine, V., Bouchard, S., & Gosselin, C. M. (2007). Computationally efficient predictive robot control. IEEE/ASME Transactions on Mechatronics, 12(5), 570–578.CrossRefGoogle Scholar
 Erez, T., Tassa, Y., & Todorov, E. (2012). Infinitehorizon model predictive control for periodic tasks with contacts. Robotics: Science and systems VII (p. 73). Cambridge: MIT Press.Google Scholar
 Erez, T., & Todorov, E. (2012). Trajectory optimization for domains with contacts using inverse dynamics. In IROS.Google Scholar
 Featherstone, R., & Orin, D. E. (2008). Dynamics. Handbook of robotics (Chap. 2). Berlin: Springer.Google Scholar
 From, P. J., Gravdahl, J. T., Lillehagen, T., & Abbeel, P. (2011). Motion planning and control of robotic manipulators on seaborne platforms. Control Engineering Practice, 19(8), 809–819.CrossRefGoogle Scholar
 Garcia, C., Prett, D., & Morari, M. (1989). Model predictive control: Theory and practice–A survey. Automatica, 25(3), 335–348.CrossRefzbMATHGoogle Scholar
 Grizzle, J. W., Abba, G., & Plestan, F. (2001). Asymptotically stable walking for biped robots: Analysis via systems with impulse effects. IEEE Transactions on Automatic Control, 46(1), 51–64.CrossRefMathSciNetzbMATHGoogle Scholar
 Guo, Z., & Hsia, T. (1993). Joint trajectory generation for redundant robots in an environment with obstacles. Journal of Robotic Systems, 10(2), 199–215.CrossRefzbMATHGoogle Scholar
 Haddadin, S., AlbuSchaffer, A., De Luca, A., & Hirzinger, G. (2008). Collision detection and reaction: A contribution to safe physical human–robot interaction. In IEEE/RSJ international conference on intelligent robots and systems, 2008. IROS 2008 (pp. 3356–3363). IEEE.Google Scholar
 Haddadin, S., AlbuSchäffer, A., & Hirzinger, G. (2011). Safe physical human–robot interaction: Measurements, analysis and new insights. Robotics research (pp. 395–407). Berlin: Springer.Google Scholar
 Haddadin, S., Haddadin, S., Khoury, A., Rokahr, T., Parusel, S., Burgkart, R., Bicchi, A., & AlbuSchaffer, A. (2012). A truly safely moving robot has to know what injury it may cause. In IEEE/RSJ international conference on intelligent robots and systems (IROS), 2012 (pp. 5406–5413). IEEE.Google Scholar
 Hogan, N. (1985). Impedance controlan approach to manipulation. ITheory. IIImplementation. IIIApplications. ASME Transactions Journal of Dynamic Systems and Measurement Control, 107, 1–24.CrossRefzbMATHGoogle Scholar
 Hogan, N., & Buerger, S. (2005). Impedance and interaction control (Chap. 9). Robotics and automation handbook. Boca Raton: CRCPress.Google Scholar
 Hornung, A., Phillips, M., Jones, E. G., Bennewitz, M., Likhachev, M., & Chitta, S. (2012) Navigation in threedimensional cluttered environments for mobile manipulation. In IEEE international conference on robotics and automation (ICRA), 2012 (pp. 423–429). IEEE.Google Scholar
 Ivaldi, S., Fumagalli, M., Nori, F., Baglietto, M., Metta, G., & Sandini, G. (2010). Approximate optimal control for reaching and trajectory planning in a humanoid robot. In IEEE/RSJ international Conference on intelligent robots and systems (IROS), 2010 (pp. 1290–1296). IEEE.Google Scholar
 Jain, A., Killpack, M. D., Edsinger, A., & Kemp, C. (2013). Reaching in clutter with wholearm tactile sensing. The International Journal of Robotics Research, 32, 458–482.CrossRefGoogle Scholar
 Katz, D., Venkatraman, A., Kazemi, M., Bagnell, J. A., & Stentz, A. (2013). Perceiving, learning, and exploiting object affordances for autonomous pile manipulation. Autonomous Robots, 37, 369–382.CrossRefGoogle Scholar
 Kavraki, L. E., & la Valle, S. M. (2008). Motion planning. In B. Siciliano & O. Khatib (Eds.), Handbook of robotics. New York: Springer.Google Scholar
 Killpack, M. D. (2013). Model predictive control with haptic feedback for robot manipulation in cluttered scenarios. Dissertation, Institute of Technology, Georgia. http://hdl.handle.net/1853/50207.
 Killpack, M. D., & Kemp, C. C. (2013). Fast reaching in clutter while regulating forces using model predictive control. In IEEERAS international conference on humanoid robots (humanoids).Google Scholar
 Kulchenko, P., & Todorov, E. (2011). Firstexit model predictive control of fast discontinuous dynamics: Application to ball bouncing. In IEEE international conference on robotics and automation (ICRA), 2011 (pp. 2144–2151). IEEE.Google Scholar
 Latombe, J. C. (1990). Robot motion planning. Berlin: Springer.Google Scholar
 Leeper, A., Hsiao, K., Ciocarlie, M., Sucan, I., & Salisbury, K. (2013). Arm teleoperation in clutter using virtual constraints from real sensor data. In RSS workshop on robots in clutter: preparing robots for the real world.Google Scholar
 Leeper, A., Hsiao, K., Ciocarlie, M., Sucan, I., & Salisbury, K. (2013). Methods for collisionfree arm teleoperation in clutter using constraints from 3d sensor data. Atlanta, GA: In IEEE international conference on humanoid robots.CrossRefGoogle Scholar
 Manchester, I. R., Mettin, U., Iida, F., & Tedrake, R. (2011). Stable dynamic walking over uneven terrain. The International Journal of Robotics Research, 30, 3.CrossRefGoogle Scholar
 Mason, M. (2001). Mechanics of robotic manipulation. London: MIT Press.Google Scholar
 Mattingley, J., & Boyd, S. (2009). Automatic code generation for realtime convex optimization. Convex optimization in signal processing and communications. Cambridge: Cambridge University Press.Google Scholar
 Mattingley, J., & Boyd, S. (2010). Realtime convex optimization in signal processing. IEEE Signal Processing Magazine, 27(3), 50–61.CrossRefGoogle Scholar
 Mattingley, J., & Boyd, S. (2012). Cvxgen: a code generator for embedded convex optimization. Optimization and Engineering, 13, 1–27.CrossRefMathSciNetzbMATHGoogle Scholar
 McClave, J., Benson, P., & Sincich, T. (2008). Statistics for business and economics. London: Pearson Education.Google Scholar
 Moler, C., & Van Loan, C. (2003). Nineteen dubious ways to compute the exponential of a matrix, twentyfive years later. SIAM Review, 45(1), 3–49.CrossRefMathSciNetzbMATHGoogle Scholar
 Mordatch, I., Popovic, Z., & Todorov, E. (2012). Contactinvariant optimization for hand manipulation. In Eurographics/ACM SIGGRAPH symposium on computer animation (pp. 137–144). The Eurographics Association.Google Scholar
 Mordatch, I., Todorov, E., & Popović, Z. (2012). Discovery of complex behaviors through contactinvariant optimization. ACM Transactions on Graphics (TOG), 31(4), 43.CrossRefGoogle Scholar
 Phan, S., Quek, Z., Shah, P., Shin, D., Ahmed, Z., Khatib, O., & Cutkosky, M. (2011). Capacitive skin sensors for robot impact monitoring. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 2992–2997). IEEE.Google Scholar
 Posa, M., & Tedrake, R. (2013). Direct trajectory optimization of rigid body dynamical systems through contact. Algorithmic foundations of robotics X (pp. 527–542). Berlin: Springer.Google Scholar
 Rossiter, J. (2003). Modelbased predictive control: a practical approach. Boca Raton: CRC.Google Scholar
 Saxena, A., Driemeyer, J., & Ng, A. (2008). Robotic grasping of novel objects using vision. The International Journal of Robotics Research, 27(2), 157.CrossRefGoogle Scholar
 Saxena, A., Wong, L., Quigley, M., & Ng, A. Y. (2011). A visionbased system for grasping novel objects in cluttered environments. Robotics research (pp. 337–348). Berlin: Springer.Google Scholar
 Shin, D., Quek, Z., Phan, S., Cutkosky, M., & Khatib, O. (2011). Instantaneous stiffness effects on impact forces in humanfriendly robots. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 2998–3003). IEEE.Google Scholar
 Siciliano, B., Sciavicco, L., Villani, L., & Oriolo, G. (2011). Robotics: modelling, planning and control. Berlin: Springer.Google Scholar
 Smith, R., et al. (2011). Open dynamics engine. http://www.ode.org.
 Sousa, C. A. D. (2014) SymPyBotics v1.0. doi: 10.5281/zenodo.11365. http://zenodo.org/record/11365.
 Srinivas, S., Ferguson, D., Helfrich, C., Berenson, D., Collet, A., Diankov, R., et al. (2009). Herb: a home exploring robotic butler. Autonomous Robots, 28, 5–20.CrossRefGoogle Scholar
 Stewart, D., & Trinkle, J. (2000). An implicit timestepping scheme for rigid body dynamics with coulomb friction. In IEEE international conference on robotics and automation, 2000. Proceedings ICRA’00 (vol. 1, pp. 162–169). IEEE.Google Scholar
 Stilman, M. (2010). Global manipulation planning in robot joint space with task constraints. IEEE Transactions on Robotics, 26(3), 576–584.CrossRefGoogle Scholar
 Stilman, M., Schamburek, J., Kuffner, J., & Asfour, T. (2007). Manipulation planning among movable obstacles. In Proceedings of IEEE international conference on robotics and automation (pp. 3327–3332). Citeseer.Google Scholar
 Wieber, P. B. (2006). Trajectory free linear model predictive control for stable walking in the presence of strong perturbations. In 6th IEEERAS international conference on humanoid robots, 2006 (pp. 137–142). IEEE.Google Scholar
Copyright information
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.