Dynamics Learning with Cascaded Variational
Inference for Multi-Step Manipulation

Kuan Fang

  Abstract
    The fundamental challenge of planning for multi-step manipulation is to find
    effective and plausible action sequences that lead to the task goal. We
    present Cascaded Variational Inference (CAVIN) Planner, a model-based method
    that hierarchically generates plans by sampling from latent spaces. To
    facilitate planning over long time horizons, our method learns latent
    representations that decouple the prediction of high-level effects from the
    generation of low-level motions through cascaded variational inference. This
    enables us to model dynamics at two different levels of temporal resolution
    for hierarchical planning. We evaluate our approach in three multi-step
    robotic manipulation tasks in cluttered tabletop environments given
    high-dimensional observations. Empirical results demonstrate that the proposed
    method outperforms state-of-the-art model-based methods by strategically
    interacting with multiple objects.
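The two-level planning idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names (`predict_effect`, `generate_actions`, `predict_step`, `plan`), the toy additive dynamics, the Gaussian sampling, and the distance-based cost are hypothetical stand-ins, not the learned models or the released codebase. The sketch only shows the cascade: first sample high-level "effect" codes and pick the subgoal sequence that ends closest to the goal, then sample low-level "motion" codes and pick the actions whose predicted outcome best matches each subgoal.

```python
import numpy as np

STEPS_PER_SUBGOAL = 4  # assumed number of low-level steps per high-level step


def predict_effect(state, effect_code):
    """Toy high-level model (assumption): state reached after one subgoal interval."""
    return state + effect_code


def generate_actions(effect_code, motion_code):
    """Toy action generator (assumption): actions for one subgoal interval.

    motion_code has shape (STEPS_PER_SUBGOAL, dim); the effect code sets the
    overall displacement and the motion code perturbs the individual steps.
    """
    return effect_code / STEPS_PER_SUBGOAL + 0.1 * motion_code


def predict_step(state, action):
    """Toy low-level model (assumption): state after a single action."""
    return state + action


def rollout(state, actions):
    """Apply the low-level model along a sequence of actions."""
    for action in actions:
        state = predict_step(state, action)
    return state


def plan(state, goal, rng, num_subgoals=3, num_samples=256):
    """Sample effect codes first, then motion codes conditioned on the chosen effects."""
    dim = state.shape[0]

    # High level: keep the effect-code sequence whose last subgoal is nearest the goal.
    best_cost, best_effects, best_subgoals = np.inf, None, None
    for effects in rng.normal(size=(num_samples, num_subgoals, dim)):
        subgoals, s = [], state
        for c in effects:
            s = predict_effect(s, c)
            subgoals.append(s)
        cost = np.linalg.norm(s - goal)
        if cost < best_cost:
            best_cost, best_effects, best_subgoals = cost, effects, subgoals

    # Low level: for each subgoal, keep the actions whose rollout lands closest to it.
    plan_actions, s = [], state
    for c, subgoal in zip(best_effects, best_subgoals):
        best_err, best_actions = np.inf, None
        for z in rng.normal(size=(num_samples, STEPS_PER_SUBGOAL, dim)):
            actions = generate_actions(c, z)
            err = np.linalg.norm(rollout(s, actions) - subgoal)
            if err < best_err:
                best_err, best_actions = err, actions
        plan_actions.append(best_actions)
        s = rollout(s, best_actions)

    return np.concatenate(plan_actions), np.stack(best_subgoals)
```

Calling `plan(np.zeros(2), np.array([1.0, -0.5]), np.random.default_rng(0))` returns an array of low-level actions together with the chosen subgoals. In this toy setting, the high-level search is over a short sequence of effect codes rather than over every individual action, which is the motivation for separating the two levels.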
  

Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation
Kuan Fang, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei
Conference on Robot Learning (CoRL), 2019


Tasks

Execution

We observe that the robot comes up with diverse strategies in different task scenarios.

Open Path: When the target object is surrounded by obstacle objects, the robot opens a path for the target object towards the goal without entering the restricted area (red tiles).

Get Around: In the presence of a pile of obstacle objects between the target and the goal, the robot pushes the target around the pile.

Squeeze Through: When there is a small gap between a group of objects, the robot squeezes the target object through the gap.

Move Away Obstacles: When pushing the target object across the bridge (grey tiles), the robot clears obstacle objects one by one along the way.

Push Target Through Obstacles: When the robot cannot directly reach the target object, it squeezes the target object by pushing obstacle objects.

Clean Up a Workspace: The robot moves objects out of a designated workspace (blue tiles).

Variations

Various layouts and objects for each task in simulation and the real world.

Code

We’ve released our codebase and the task environments in simulation and the real world.
