Parareal with a Learned Coarse Model for Robotic Manipulation

A key component of many robotics model-based planning and control algorithms is physics prediction, that is, forecasting a sequence of states given an initial state and a sequence of controls. This process is slow and a major computational bottleneck for robotics planning algorithms. Parallel-in-time integration methods can leverage parallel computing to accelerate physics predictions and thus planning. The Parareal algorithm iterates between a coarse serial integrator and a fine parallel integrator. A key challenge is to devise a coarse model that is computationally cheap but accurate enough for Parareal to converge quickly. Here, we investigate the use of a deep neural network physics model as a coarse model for Parareal in the context of robotic manipulation. In simulated experiments using the physics engine Mujoco as fine propagator, we show that the learned coarse model leads to faster Parareal convergence than a coarse physics-based model. We further show that the learned coarse model allows Parareal to be applied to scenarios with multiple objects, where the physics-based coarse model is not applicable. Finally, we conduct experiments on a real robot and show that Parareal predictions are close to real-world physics for robotic pushing of multiple objects. Some real robot manipulation plans using Parareal can be found at https://www.youtube.com/watch?v=wCh2o1rf-gA .


Introduction
We present a method for fast and accurate physics predictions during non-prehensile manipulation planning and control. An example scenario is shown in Figure 1, where a robot arm pushes the marked cylindrical object into a target zone without pushing the other three objects off the table. We are interested in predicting the motion of the objects in a fast and accurate way. Physics engines like Mujoco solve Newton's equations of motion to predict motion. They are accurate but slow. Coarse models can be built by introducing simplifying assumptions, trading accuracy for solution speed, but their lack of precision will eventually compromise the robot's chance of completing a given task successfully.
Given an initial state and a sequence of controls, the problem of predicting the resulting sequence of states is a key component of a variety of model-based planning and control algorithms [10,12,11,25,5,9,2,22,14]. Mathematically, such a prediction requires solving an initial value problem. Typically, these are solved through numerical integration over time steps, using for example Euler's method or Runge-Kutta methods and an underlying physics model to provide the forces. However, these accurate physics-based predictions are still slow [6], and faster physics-based predictions can contribute significantly to contact-based/non-prehensile manipulation planning and control.
In a previous paper [3], we demonstrated that predictions for a robot pushing a single object can be made faster by combining a fine physics-based model with a simple, coarse physics-based model using the parallel-in-time method Parareal. Using 4 cores, Parareal was about a factor of two faster than the fine physics engine alone while providing comparable accuracy and the same success rate for push planning with obstacle avoidance. Here, we extend these results by investigating a learned deep neural network as coarse model and show that it leads to faster Parareal convergence. We also demonstrate that Parareal can be used to speed up physics prediction in scenarios where the robot pushes multiple objects.

Related Work
Parareal has been used in many different areas. Trindade et al., for example, use it to simulate incompressible laminar flows [24]. Maday et al. have tested it for simulating dynamics in quantum chemistry [16]. The method was introduced by Lions et al. in 2001 [15]. Combinations of parallel-in-time integration and neural networks have not yet been studied widely. Very recently, Yalla and Engquist showed the promise of using a machine-learned model as coarse propagator for test problems [26]. Going the other way, Schroder [21] and Günther et al. [20] recently showed that parallel-in-time integration can be used to speed up the process of training neural networks. Parareal's potential to speed up planning simulations for robotic manipulation in single-object scenarios using a physics-based coarse model was recently demonstrated by Agboh et al. [3].
Combining different physics models for robotic manipulation has been the topic of recent research, although not with a focus on improving prediction speed. Kloss et al. [13] address the question of accuracy and generalization in combined neural-analytical models. Ajay et al. [4] focus on modeling of the inherent stochastic nature of the real world physics, by combining an analytical, deterministic rigid-body simulator with a stochastic neural network.
We can make physics engines faster by using larger simulation time steps; however, this decreases the accuracy and can result in unstable behavior. To generate stable behavior at large time-step sizes, Pan et al. [18] propose an integrator for articulated body dynamics that uses only position variables to formulate the dynamic equation. Moreover, Fan et al. [7] propose linear-time variational integrators of arbitrarily high order for robotic simulation and use them in trajectory optimization to complete robotics tasks. Recent work has also focused on making the underlying planning and control algorithms faster. For example, Giftthaler et al. [8] introduced a multiple-shooting variant of the iterative linear quadratic regulator (iLQR) trajectory optimizer, which has shown impressive results for real-time nonlinear optimal control of complex robotic systems [17,19].

Robotic manipulation
Consider the scene shown in Figure 1. The robot's manipulation task is to control the motion of the green goal object through pushing contact from the cylindrical pusher in the robot's gripper. The robot needs to push the goal object into a goal region marked with an X. It is allowed to make contact with other sliders but not push them off the table or into the goal region.
The system's state at time point $n$ consists of the pose $q$ and velocity $\dot{q}$ of the pusher $P$ and $N_s$ sliders, $S_1, \dots, S_{N_s}$:
$$x_n = [q^P_n, q^{S_1}_n, \dots, q^{S_{N_s}}_n, \dot{q}^P_n, \dot{q}^{S_1}_n, \dots, \dot{q}^{S_{N_s}}_n].$$
The pose of slider $i$ consists of its position and orientation on the plane: $q^{S_i} = [q^{S_i}_x, q^{S_i}_y, q^{S_i}_\theta]^T$. The pusher's pose is $q^P = [q^P_x, q^P_y]^T$, and control inputs are velocities $u_n = [u^x_n, u^y_n]^T$ applied on the pusher at time $n$ for a control duration of $\Delta t$.
A robotics planning and control algorithm takes in an initial state of the system $x_0$ and outputs an optimal sequence of controls $\{u_0, u_1, \dots, u_{N-1}\}$. However, to generate this optimal sequence, the planner needs to simulate many different control sequences and predict the resulting sequences of states $\{x_1, x_2, \dots, x_N\}$.
The planner makes these simulations through a physics model $F$ of the real world that predicts the next state $x_{n+1}$ given the current state $x_n$ and a control input $u_n$:
$$x_{n+1} = F(x_n, u_n, \Delta t). \quad (1)$$
We use the general physics engine Mujoco [23] to model the system dynamics. It solves Newton's equations of motion for the complex multi-contact dynamics problem.

Parareal
Fig. 1: Example of a robotic manipulation planning and control task using physics predictions. The robot controls the motion of the green object solely through contact. The goal is to push the green object into the target region marked X. The robot must complete the task without pushing other objects off the table or into the goal region.
Normally, computing all states $x_n$ happens in a serial fashion, by evaluating (1) first for $n = 0$, then for $n = 1$, etc. Parareal replaces this inherently serial procedure by a parallel-in-time integration process where some of the work can be done in parallel. For Parareal, we need a coarse physics model
$$x_{n+1} = C(x_n, u_n, \Delta t). \quad (2)$$
It needs to be computationally cheap relative to the fine model but does not have to be very accurate. Parareal begins by computing an initial guess $x^{k=0}_n$ of the state at each time point $n$ of the trajectory using the coarse model.
This guess is then corrected via the Parareal iteration
$$x^{k+1}_{n+1} = C(x^{k+1}_n, u_n, \Delta t) + F(x^k_n, u_n, \Delta t) - C(x^k_n, u_n, \Delta t) \quad (3)$$
for all timesteps $n = 0, \dots, N-1$. The newly introduced superscript $k$ counts the number of Parareal iterations.
The key point in iteration (3) is that evaluating the fine physics model can be done in parallel for all $n = 0, \dots, N-1$, while only the fast coarse model has to be computed serially. After one Parareal iteration, $x^1_1$ is exactly the fine solution. After two iterations, $x^1_1$ and $x^2_2$ are exactly the fine solutions. When $k = N$, Parareal produces the exact fine solution. However, to produce speedup, we need to stop Parareal at much earlier iterations. This way, Parareal can run in less wall-clock time than running the fine model serially step by step. Below, we demonstrate that even after a small number of iterations, the solution produced by Parareal is of sufficient quality to allow our robot to succeed with different tasks. Note that, for the sake of simplicity, we assume here that the number of controls $N$ and the number of processors used to parallelize in time are identical, but this can easily be generalised.
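The iteration described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the toy dynamics `fine_step` and `coarse_step` (many small Euler steps versus a single Euler step of dx/dt = -x + u) are hypothetical stand-ins for Mujoco and the coarse model, and a thread pool stands in for the parallel workers.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def coarse_step(x, u, dt):
    # Hypothetical cheap model: one explicit Euler step of dx/dt = -x + u.
    return x + dt * (-x + u)

def fine_step(x, u, dt, substeps=100):
    # Hypothetical accurate model: many small Euler steps of the same ODE.
    h = dt / substeps
    for _ in range(substeps):
        x = x + h * (-x + u)
    return x

def serial_rollout(step, x0, controls, dt):
    # Classical serial prediction: evaluate the model one step after another.
    states = [np.asarray(x0, dtype=float)]
    for u in controls:
        states.append(step(states[-1], u, dt))
    return states

def parareal(fine, coarse, x0, controls, dt, iterations, workers=4):
    n_steps = len(controls)
    # k = 0: initial guess from a serial sweep of the coarse model.
    x = serial_rollout(coarse, x0, controls, dt)
    c_old = x[1:]  # coarse(x[n]) for the current iterate
    # Threads are used for brevity; a real setup would need workers that
    # actually run the fine model concurrently (for example processes).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iterations):
            # Fine model on the previous iterate: independent for every n.
            f_old = list(pool.map(
                lambda n: fine(x[n], controls[n], dt), range(n_steps)))
            # Serial correction sweep using only the coarse model.
            x_new, c_new = [x[0]], []
            for n in range(n_steps):
                c = coarse(x_new[n], controls[n], dt)
                x_new.append(c + f_old[n] - c_old[n])
                c_new.append(c)
            x, c_old = x_new, c_new
    return x
```

Calling `parareal(fine_step, coarse_step, x0, controls, dt, iterations=1)` returns the states after one correction; with `iterations` equal to the number of controls, the result matches `serial_rollout(fine_step, ...)` up to rounding.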

Coarse models
In this section, we introduce two coarse physics models for Parareal: a learned coarse model and the analytical coarse model from Agboh et al. [3].

Learned coarse model
As an alternative to the coarse physics model, we train a deep neural network as a coarse model for Parareal for robotic pushing.

Network architecture
The input to our neural network model is a state $x_n$ and a single action $u_n$. The output is a single next state $x_{n+1}$. We use a feed-forward deep neural network (DNN) with 5 fully connected layers. The first 4 contain 512, 256, 128 and 64 neurons, respectively, with ReLU activation functions. The output layer contains 24 neurons with linear activation functions.
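As a rough illustration of this architecture, the forward pass can be written directly in NumPy. The layer widths follow the text; the input dimension, the weight initialisation and the function names `init_params` and `forward` are assumptions made for this sketch, and the weights here are random rather than trained.

```python
import numpy as np

# Layer widths from the text: four hidden ReLU layers and a linear output.
LAYER_SIZES = [512, 256, 128, 64, 24]

def init_params(input_dim, rng):
    # He-style random initialisation (an assumption; the text does not
    # specify how the weights are initialised).
    params, fan_in = [], input_dim
    for fan_out in LAYER_SIZES:
        weights = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        params.append((weights, np.zeros(fan_out)))
        fan_in = fan_out
    return params

def forward(params, state, action):
    # The network input is the state concatenated with the action.
    h = np.concatenate([state, action], axis=-1)
    for weights, bias in params[:-1]:
        h = np.maximum(h @ weights + bias, 0.0)  # ReLU hidden layers
    weights, bias = params[-1]
    return h @ weights + bias  # linear output layer with 24 units
```

For example, `forward(init_params(30, np.random.default_rng(0)), state, action)` accepts a 28-dimensional state and a 2-dimensional action; these input sizes are illustrative only, and how the 24 outputs map onto the state components is not specified here.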

Dataset
We collect training data using the physics engine Mujoco [23]. Each training sample is a tuple $(x_n, u_n, x_{n+1})$. It contains a randomly sampled initial state, action, and next state. We collect over 2 million such samples from the physics simulator.
During robotic pushing, a physics model may need to predict the resulting state even for cases when there is no contact between pusher and slider. We include both contact and no-contact cases in the training data.
We train a single neural network to handle one pusher with at least one and at most $N_s$ objects being pushed (also called sliders). While collecting data for a particular number of sliders, we placed the unused sliders in distinct fixed positions outside the pushing workspace. These exact positions must be passed to the neural network at test time if fewer than $N_s$ sliders are active. For example, if $N_s = 4$, to make a prediction for a 3-slider scene, we place the last slider at the same fixed position used during training.
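The padding scheme for unused sliders can be sketched as follows. The parking poses in `PARKED` and the helper name `build_state` are hypothetical; the ordering of the state vector (poses first, then velocities) is assumed from the state definition given earlier.

```python
import numpy as np

N_S = 4  # maximum number of sliders handled by the network

# Hypothetical fixed parking poses (x, y, theta) outside the workspace.
# The values actually used during training are not given in the text.
PARKED = np.array([[2.0, 0.0, 0.0],
                   [2.0, 0.5, 0.0],
                   [2.0, 1.0, 0.0],
                   [2.0, 1.5, 0.0]])

def build_state(pusher_pose, pusher_vel, slider_poses, slider_vels):
    slider_poses = np.asarray(slider_poses, dtype=float).reshape(-1, 3)
    slider_vels = np.asarray(slider_vels, dtype=float).reshape(-1, 3)
    active = len(slider_poses)
    if not 1 <= active <= N_S:
        raise ValueError("expected between 1 and N_S active sliders")
    # Unused sliders sit at their fixed parking poses with zero velocity.
    poses = np.vstack([slider_poses, PARKED[active:]])
    vels = np.vstack([slider_vels, np.zeros((N_S - active, 3))])
    return np.concatenate([pusher_pose, poses.ravel(),
                           pusher_vel, vels.ravel()])
```

With three active sliders, the fourth slot of the returned vector holds `PARKED[3]`, so the network sees the same input for the unused slider as it did during training.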

Loss function
The standard loss function for training is the mean squared error between the network's prediction and the training data. On its own, this leads to infeasible state predictions with pusher-slider or slider-slider penetration. We resolve this by adding no-penetration loss terms, which gives the final loss function in Equation 4. Here, $W_F$ is a constant weight, $B$ is the batch size, $V$ is the number of samples per batch, $x^f_{ij}$ is the next state predicted by the fine model, and $x^{NN}_{ij}$ is the next state predicted by the DNN model. $p_i$ and $p_j$ are the positions of sliders $i$ and $j$, respectively, $r_p$ is the radius of the pusher, and $r_i$, $r_j$ are the radii of sliders $i$ and $j$, respectively. The first line of Equation 4 is the standard mean squared error. The second line penalizes pusher-slider penetration and the third line penalizes slider-slider penetration.
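To make the structure of such a loss concrete, the sketch below combines a mean squared error with two penetration penalties. The hinge form `max(0, r_a + r_b - distance)`, the way the terms are averaged, and the single `weight` parameter are assumptions chosen for illustration; they are not claimed to match Equation 4 term by term.

```python
import numpy as np

def no_penetration_loss(nn_state, fine_state, pusher_xy, slider_xy,
                        r_p, radii, weight=1.0):
    """nn_state, fine_state: (B, D); pusher_xy: (B, 2);
    slider_xy: (B, Ns, 2) predicted positions; radii: (Ns,)."""
    radii = np.asarray(radii, dtype=float)
    # Term 1: standard mean squared error against the fine model.
    mse = np.mean((nn_state - fine_state) ** 2)
    # Term 2: pusher-slider overlap, positive only when discs intersect.
    d_ps = np.linalg.norm(slider_xy - pusher_xy[:, None, :], axis=-1)
    pen_ps = np.maximum(0.0, r_p + radii[None, :] - d_ps)
    # Term 3: slider-slider overlap for every pair i < j.
    diff = slider_xy[:, :, None, :] - slider_xy[:, None, :, :]
    d_ss = np.linalg.norm(diff, axis=-1)
    min_dist = radii[:, None] + radii[None, :]
    i, j = np.triu_indices(len(radii), k=1)
    pen_ss = np.maximum(0.0, min_dist[i, j] - d_ss[:, i, j])
    penalty = (pen_ps.sum(axis=1) + pen_ss.sum(axis=1)).mean()
    return mse + weight * penalty
```

When no bodies overlap, both penalty terms vanish and the function reduces to the plain mean squared error.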
Finally, the network makes a single step prediction. However, robotic manipulation typically needs a multistep prediction as a result of a control sequence. To do this, we start from the initial state and apply the first action in the sequence to get a resulting next state. Then, we use this next state as a new input to the network together with the second action in the sequence and so on. This way, we repeatedly query the network with its previous predictions as the current state input.
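The multi-step prediction described above amounts to a simple autoregressive loop. In this sketch, `model` is any single-step predictor taking a state and an action; the name `rollout` is an assumption made for illustration.

```python
import numpy as np

def rollout(model, x0, controls):
    # Feed each prediction back in as the next input state.
    states = []
    x = np.asarray(x0, dtype=float)
    for u in controls:
        x = model(x, u)
        states.append(x)
    return states
```

For instance, with the toy model `lambda x, u: x + u`, the rollout returns the running sum of the controls added to the initial state. Because each prediction is built on the previous one, single-step errors can accumulate along the sequence.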

Analytical coarse model
Agboh et al. [3] have proposed a simple, kinematic coarse physics model for pushing a single object. The model moves the slider with the same linear velocity as the pusher as long as there is contact between the two. We give details below for completeness:
$$q^S_{n+1} = q^S_n + [u^x_n, u^y_n, \omega]^T \cdot p_c \cdot \Delta t \quad (5)$$
$$\dot{q}^S_{n+1} = \begin{cases} [u^x_n, u^y_n, \omega]^T & \text{if } p_c > 0, \\ \dot{q}^S_n & \text{otherwise} \end{cases} \quad (7)$$
$$q^P_{n+1} = q^P_n + u_n \cdot \Delta t, \quad \dot{q}^P_{n+1} = u_n.$$
Here, $p_c$ is the ratio of the contact distance $d_{contact}$, travelled by the pusher while in contact with the slider, to the total pushing distance; $r_c$ is a vector from the contact point to the object's center at the current state $q^S_n$; $\theta$ is the angle between the pushing direction and the vector $r_c$; and $\omega$ is the coarse angular velocity induced by the pusher on the slider. $K_\omega$ is a positive constant.
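The update equations of the analytical model translate into a short function. In this sketch, the contact ratio `p_c` and the angular velocity `omega` are passed in as arguments, because the way they are computed from the contact geometry is not reproduced here; the function name is hypothetical.

```python
import numpy as np

def analytic_coarse_step(q_s, qd_s, q_p, u, omega, p_c, dt):
    """q_s, qd_s: slider pose and velocity (x, y, theta);
    q_p: pusher position (x, y); u: pusher velocity command (x, y)."""
    u = np.asarray(u, dtype=float)
    twist = np.array([u[0], u[1], omega])
    # Slider moves with the pusher for the fraction p_c of the push.
    q_s_next = np.asarray(q_s, dtype=float) + twist * p_c * dt
    # Slider velocity changes only if contact occurred during the step.
    qd_s_next = twist if p_c > 0 else np.asarray(qd_s, dtype=float)
    # Pusher follows the commanded velocity exactly.
    q_p_next = np.asarray(q_p, dtype=float) + u * dt
    qd_p_next = u
    return q_s_next, qd_s_next, q_p_next, qd_p_next
```

With `p_c = 0` the slider state is returned unchanged and only the pusher moves, which is the no-contact case.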

Planning and control
We use the predictive model based on Parareal described above in a planning and control framework for pushing an object on a table to a target location. We take an optimization approach to solve this problem. Given the table geometry, goal position, the current state of the pusher and all sliders $x_0$, and an initial candidate sequence of controls $\{u_0, u_1, \dots, u_{N-1}\}$, the optimization procedure outputs an optimal sequence $\{u^*_0, u^*_1, \dots, u^*_{N-1}\}$ according to some defined cost.
The predictive model is used within this optimizer to roll out a sequence of controls and predict the states $\{x_1, \dots, x_N\}$. These are then used to compute the cost associated with those controls. The details of the trajectory optimizer can be found in Agboh et al. [1]. The cost function we use penalizes moving obstacle sliders and dropping objects from the table but encourages getting the goal object into the goal location.
We use the trajectory optimizer in a model-predictive control (MPC) framework. Once we get an output control sequence from the optimizer, we do not execute the whole sequence on the real robot. Instead, we execute only the first action, update $x_0$ with the observed state of the system, and repeat the optimization to generate a new control sequence. We repeat this process until the task is complete.
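The receding-horizon loop described above can be sketched as follows. The callables `optimize`, `execute`, `observe` and `is_done` are hypothetical interfaces, and `ToyPusher1D` is a one-dimensional stand-in for the robot and the trajectory optimizer, included only so that the loop can be run. Reusing the previous plan as the next initial candidate is also an assumption.

```python
def run_mpc(optimize, execute, observe, is_done, initial_controls,
            max_steps=100):
    controls = list(initial_controls)
    for _ in range(max_steps):
        x0 = observe()                      # measure the current state
        if is_done(x0):
            return True
        controls = optimize(x0, controls)   # re-plan from the observed state
        execute(controls[0])                # apply only the first action
    return is_done(observe())

class ToyPusher1D:
    """Hypothetical 1D system: the state is a position on a line."""
    def __init__(self, x=0.0, goal=1.0, max_step=0.3):
        self.x, self.goal, self.max_step = x, goal, max_step

    def observe(self):
        return self.x

    def execute(self, u):
        self.x += u

    def is_done(self, x):
        return abs(x - self.goal) < 1e-6

    def optimize(self, x0, controls):
        # Stand-in for a trajectory optimizer: greedy, step-limited plan.
        remaining, plan = self.goal - x0, []
        for _ in controls:
            u = max(-self.max_step, min(self.max_step, remaining))
            plan.append(u)
            remaining -= u
        return plan
```

Running `run_mpc(toy.optimize, toy.execute, toy.observe, toy.is_done, [0.0] * 4)` on a `ToyPusher1D` instance moves the toy state to its goal while re-planning after every executed action.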
Such an optimization-based MPC approach to pushing manipulation is frequently used to handle uncertainty and improve success in the real-world [5,10,13,2]. Here, our focus is to evaluate the performance of Parareal with learned coarse model for planning and control.

Experiments and Results
In our experiments, we investigate three key issues. First, we investigate how fast Parareal converges to the fine solution for robotic pushing tasks with different coarse models. Second, we investigate the physics prediction accuracy of Parareal with respect to real-world pushing data. Finally, we demonstrate that the Parareal physics model can be used to complete real-robot manipulation tasks.
In Subsection 6.1 we provide preliminary information used throughout the experiments. Subsection 6.2 investigates convergence of Parareal for two different coarse models: the analytical coarse model for single object pushing and a learned coarse model for both single and multiple object pushing. In Subsection 6.3 we present results from real-robot experiments. First, we compare the accuracy of Parareal predictions against real-world pushing physics. Then, we show several real-robot plan executions using Parareal with a learned coarse physics model as predictive model.

Preliminaries
In all experiments, we run Mujoco with a time step of 1 ms, the largest value for which the simulator remains stable. All computations run on a standard laptop PC with an Intel(R) Core(TM) i7-4712HQ CPU @ 2.3 GHz with $N = 4$ cores. Our control sequences consist of four actions, each applied for a control duration $\Delta t = 1$ s.
Our real robot setup is shown in Figure 1. We have a Robotiq two-finger gripper holding the cylindrical pusher. We place markers on the pusher and sliders to sense their full pose in the environment with an OptiTrack motion capture system.

Parareal convergence
In this section we investigate how fast Parareal converges using two coarse models: the analytical model for single object pushing and the learned model for both single object and multi-object pushing. At each iteration, we compute a root mean square (RMS) error between Parareal's predictions and the fine model's predictions of the corresponding sequence of states.
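One way to compute such an error is sketched below. Averaging over all state components and all time points at once is an assumption; the exact aggregation behind the reported curves may differ, for instance by restricting the error to a single slider.

```python
import numpy as np

def rms_error(parareal_states, fine_states):
    # Stack both trajectories into (time points, state dimension) arrays
    # and take the root of the mean squared difference over all entries.
    diff = np.asarray(parareal_states, dtype=float) - np.asarray(fine_states, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Evaluating this function after every Parareal iteration gives one error value per iteration, which is the quantity a convergence plot would track.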

Single object pushing
We randomly sample an initial state for the pusher and slider. We also randomly sample a control sequence where the pusher contacts the slider at least once during execution. Thereafter, we execute the control sequence starting from the initial state using Parareal. For the sampled state and control sequence, we perform two runs, one using the learned model and the other using the physics model as coarse propagator in Parareal. The analytical model makes a single step prediction 170 times faster than the fine model on average, while the learned model is 130 times faster on average. While technically this means a tighter bound on speedup for the learned model (see Section II.A in Agboh et al. [3]), both models are so fast that our actual speedup is almost completely governed by the number of iterations. Therefore, the slightly higher cost of the learned versus the physics coarse model does not have a significant impact on Parareal's performance.
Fig. 3: Root mean square error (in log scale) along the full trajectory per slider in a 4-slider pushing experiment (left) using only the learned model. We find that the learned model enables Parareal convergence for the multi-object case. Two sample motions are illustrated (center and right) for multi-object physics prediction: a 4-slider Parareal prediction and a 2-slider Parareal prediction.
We collect 100 state and control sequence samples and compute the RMS error between Parareal and the fine model run in serial. The results are shown in Figure 2 (left). We see that the learned model leads to faster convergence of Parareal than the analytical model for single object pushing. This is because the learned model is more accurate. For example, the single-step prediction of the learned model, shown in red in Figure 2 (right), is much closer to the fine prediction shown in green than that of the analytical model shown in Figure 2 (center).

Multi-object pushing
We randomly sample a valid initial state for the pusher and multiple sliders. Then, similar to the single object pushing case, we also sample a random control sequence that makes contact with at least one slider. We then predict the corresponding sequence of states using Parareal. However, for multi-object pushing we use only the learned model as the coarse physics model within Parareal. The analytical model for single-object pushing would need significant modifications to work for the multi-object case. Again, we collect 100 state and control sequence samples and run Parareal for each of them. Our results are shown in Figure 3. Figure 3 (left) shows the RMS error per slider for each Parareal iteration. While there are differences in the accuracy of the predictions for different sliders, all errors decrease and Parareal converges at a reasonable pace. Some sample predictions are shown for a 4-slider environment in Figure 3 (center), and for a 2-slider environment in Figure 3 (right). In both scenes, the pusher moves forward making contact with multiple sliders and Parareal is able to predict how the state evolves.

Real robot experiments
In this section we investigate the physics prediction accuracy of Parareal with respect to real-world pushing physics. We do this for the multi-object case. In addition, we show real-world demonstrations for robotic manipulation where we use Parareal for physics prediction.

Parareal prediction vs. real-world physics
Our coarse model neural network was trained using simulated data. Here, we demonstrate that Parareal using the trained coarse model is also able to predict real-world states. We randomly set an initial state in a real-world example by selecting positions for the pusher and sliders. This state is recorded using our motion capture system. Next, we sample a control sequence and let the real robot execute it. Again, we record the corresponding sequence of states using motion capture. Then, for the recorded initial state and control sequence pair, we use Parareal to produce the corresponding sequence of states and compare the result against the states measured for the real robot with optical tracking. Figure 5 shows the RMS error between Parareal's prediction at different iteration numbers and the real-world pushing data. Vertical red bars indicate 95% confidence intervals.
Parareal's real-world error decreases with increasing iteration numbers and it is eventually twice as accurate as the coarse model. These results indicate that

Planning and control
We use the Parareal predictive model for robotic manipulation to generate plans faster than using the fine model directly. In this section, we complete 3 real robot executions with Parareal at 1 iteration. We use the learned model as the coarse model in all cases.
As can be seen in Figure 6, the robot's task is to push the green slider into the target region marked with X. The robot is allowed to make contact with other sliders but not push them off the table or into the goal region.
The robot was successful for all 3 sample scenes. Some sample plans for two scenes are shown in Figure 6. The third scene is shown in Figure 1. We find that using Parareal with a learned coarse model for physics predictions, a robot can successfully complete complex real-world pushing manipulation tasks involving multiple objects. At 1 Parareal iteration, we complete the tasks about 4 times faster than directly using the fine model.
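A back-of-the-envelope cost model helps put this speedup in context. The sketch below assumes a simple, non-pipelined model in which Parareal costs one serial coarse sweep for the initial guess plus, per iteration, one parallel fine step and one serial coarse sweep. This model and the function name are assumptions for illustration; it ignores communication and optimizer overhead and is not necessarily the bound derived in [3].

```python
def parareal_speedup(n_steps, iterations, cost_ratio):
    """Idealised speedup over the serial fine model.

    cost_ratio is the cost of one fine step divided by the cost of one
    coarse step (for example 130 or 170, as reported in the text)."""
    # All costs are measured in units of one coarse step.
    serial_fine = n_steps * cost_ratio
    parareal = (iterations + 1) * n_steps + iterations * cost_ratio
    return serial_fine / parareal
```

Under these assumptions, with 4 time slices and 1 iteration the model gives roughly 3.77 for a cost ratio of 130 and roughly 3.82 for a ratio of 170. Both values sit just below the limit of 4 set by the number of time slices divided by the number of iterations, which illustrates why the iteration count, rather than the coarse model's cost, dominates the achievable speedup.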

Summary
We demonstrate the promise of using Parareal to parallelize the predictive model in a robot manipulation task involving multiple objects. As coarse model, we propose a neural network, trained with a physics simulator. We show that for single object pushing, Parareal converges faster with the learned model than with a coarse physics-based model we introduced in earlier work. Furthermore, we show that Parareal with the learned model as coarse propagator can successfully complete tasks that involve pushing multiple objects. We also show that although a simulator is used to provide training data, Parareal with a learned coarse model can accurately predict experiments that involve pushing with a real robot.
Fig. 5: The resulting sequence of states for applying a random control sequence starting from some random initial state in the real world. Our goal is to assess the accuracy of the Parareal physics models with respect to real-world physics. We collect 50 such samples. These are some snapshots for 3 such scenes, one per row, with the initial state on the left and the final state on the right.