Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation


The fundamental challenge of planning for multi-step manipulation is to find effective and plausible action sequences that lead to the task goal. We present the Cascaded Variational Inference (CAVIN) Planner, a model-based method that hierarchically generates plans by sampling from latent spaces. To facilitate planning over long time horizons, our method learns latent representations that decouple the prediction of high-level effects from the generation of low-level motions through cascaded variational inference. This enables us to model dynamics at two different levels of temporal resolution for hierarchical planning. We evaluate our approach on three multi-step robotic manipulation tasks in cluttered tabletop environments given high-dimensional observations. Empirical results demonstrate that the proposed method outperforms state-of-the-art model-based methods by strategically interacting with multiple objects.
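The two-level sampling idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the state is a 2D point, the three models (`meta_dynamics`, `action_generator`, `dynamics`) are hand-written toy functions standing in for learned networks, both latent codes are drawn from a standard normal prior, and candidates are ranked by plain Euclidean distance. It is not the released implementation.

```python
import numpy as np

def meta_dynamics(state, c):
    """Toy high-level model (assumed): subgoal reached after several steps under effect code c."""
    return state + 0.5 * c

def action_generator(state, c, z):
    """Toy generator (assumed): one low-level action from effect code c and motion code z."""
    return 0.1 * c + 0.02 * z

def dynamics(state, action):
    """Toy low-level model (assumed): next state after a single action."""
    return state + action

def plan(state, goal, rng, n_samples=256, steps_per_subgoal=5):
    # High level: sample effect codes and keep the one whose predicted
    # subgoal lands closest to the task goal.
    cs = rng.standard_normal((n_samples, 2))
    subgoals = meta_dynamics(state, cs)
    best = np.argmin(np.linalg.norm(subgoals - goal, axis=1))
    c, subgoal = cs[best], subgoals[best]

    # Low level: sample motion-code sequences, roll them out step by step,
    # and keep the action sequence that ends closest to the chosen subgoal.
    zs = rng.standard_normal((n_samples, steps_per_subgoal, 2))
    states = np.repeat(state[None, :], n_samples, axis=0)
    actions = np.empty_like(zs)
    for t in range(steps_per_subgoal):
        actions[:, t] = action_generator(states, c, zs[:, t])
        states = dynamics(states, actions[:, t])
    best = np.argmin(np.linalg.norm(states - subgoal, axis=1))
    return subgoal, actions[best]

rng = np.random.default_rng(0)
start, goal = np.zeros(2), np.array([0.5, 0.0])
subgoal, action_sequence = plan(start, goal, rng)
print("subgoal:", subgoal)
print("actions:\n", action_sequence)
```

The high-level search only reasons about where the scene should end up after several steps, while the low-level search fills in the individual motions. This separation is what lets the two models operate at different temporal resolutions.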
Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation
Kuan Fang, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei
Conference on Robot Learning (CoRL), 2019



We observe that the robot comes up with diverse strategies in different task scenarios.

Open Path: When the target object is surrounded by obstacles, the robot opens a path for the target object towards the goal.

Get Around: In the presence of a pile of objects between the target and the goal, the robot pushes the target around.

Squeeze Through: When there is a small gap, the robot squeezes the target object through the gap.

Move Away Obstacles: The robot clears obstacles one by one along the way of the target object.

Push Target Through Obstacles: When the robot cannot directly reach the target object, it squeezes the target object by pushing obstacles.


Various layouts and objects for each task in simulation and the real world.


Coming soon.