Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation

Abstract

The fundamental challenge of planning for multi-step manipulation is to find effective and plausible action sequences that lead to the task goal. We present the Cascaded Variational Inference (CAVIN) Planner, a model-based method that hierarchically generates plans by sampling from latent spaces. To facilitate planning over long time horizons, our method learns latent representations that decouple the prediction of high-level effects from the generation of low-level motions through cascaded variational inference. This enables us to model dynamics at two different levels of temporal resolution for hierarchical planning. We evaluate our approach on three multi-step robotic manipulation tasks in cluttered tabletop environments given high-dimensional observations. Empirical results demonstrate that the proposed method outperforms state-of-the-art model-based methods by strategically interacting with multiple objects.

Presented at CoRL 2019 (Oral Presentation)


Tasks

Execution

We observe that the robot comes up with diverse strategies in different task scenarios.

Open Path: When the target object is surrounded by obstacles, the robot opens a path for the target object towards the goal.

Get Around: In the presence of a tile of objects between the target and the goal, the robot pushes the target around them.

Squeeze Through: When there is a small gap, the robot squeezes the target object through the gap.

Move Away Obstacles: The robot clears obstacles one by one from the target object's path.

Push Target Through Obstacles: When the robot cannot directly reach the target object, it moves the target object indirectly by pushing the obstacles against it.

Variations

Various layouts and objects for each task in simulation and the real world.

Code

Coming soon.
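
The hierarchical sampling idea described in the abstract can be illustrated with a toy sketch. This is not the released CAVIN code and not the authors' implementation: the three model functions below are hand-written stand-ins for networks that would be learned, the 2D point state is an assumption made for brevity, and all names (meta_dynamics, action_generator, dynamics, "effect code", "motion code") are hypothetical. The sketch first samples high-level latent codes to choose a sequence of predicted subgoals, then samples low-level latent codes to find actions whose predicted outcome matches each subgoal.

```python
import numpy as np

STEPS_PER_EFFECT = 3  # low-level steps spanned by one high-level effect (assumed)


def meta_dynamics(state, effect_code):
    """Stand-in for a learned high-level model: state after one effect."""
    return state + effect_code


def action_generator(effect_code, motion_code):
    """Stand-in for a learned generator: decode latents into low-level actions."""
    base = effect_code / STEPS_PER_EFFECT
    return [base + 0.1 * motion_code[i] for i in range(STEPS_PER_EFFECT)]


def dynamics(state, action):
    """Stand-in for a learned low-level model: state after one action."""
    return state + action


def plan(state, goal, rng, num_effects=3, num_samples=256):
    # High level: sample sequences of effect codes and keep the sequence
    # whose predicted final state is closest to the goal.
    best_codes, best_subgoals, best_cost = None, None, np.inf
    for _ in range(num_samples):
        codes = rng.normal(size=(num_effects, 2))
        s, subgoals = state, []
        for c in codes:
            s = meta_dynamics(s, c)
            subgoals.append(s)
        cost = np.linalg.norm(s - goal)
        if cost < best_cost:
            best_codes, best_subgoals, best_cost = codes, subgoals, cost

    # Low level: for each chosen effect, sample motion codes and keep the
    # action sequence whose predicted end state best matches the subgoal.
    actions, s = [], state
    for c, subgoal in zip(best_codes, best_subgoals):
        best_seq, best_end, best_err = None, None, np.inf
        for _ in range(num_samples):
            z = rng.normal(size=(STEPS_PER_EFFECT, 2))
            seq = action_generator(c, z)
            end = s
            for a in seq:
                end = dynamics(end, a)
            err = np.linalg.norm(end - subgoal)
            if err < best_err:
                best_seq, best_end, best_err = seq, end, err
        actions.extend(best_seq)
        s = best_end
    return actions, s


rng = np.random.default_rng(0)
start, goal = np.zeros(2), np.array([2.0, 1.0])
actions, predicted = plan(start, goal, rng)
print(len(actions), np.linalg.norm(predicted - goal))
```

The split into two sampling stages is the point of the sketch: the high-level search reasons over a few coarse effects, and the low-level search only has to reach the next subgoal rather than the final goal.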