1 Center for Vision, Cognition, Learning, and Autonomy, UCLA
2 Department of Electronic Engineering, Fudan University
3 School of Informatics and Computing, Indiana University Bloomington
In this paper, we present a general framework for learning a social affordance grammar as a spatiotemporal AND-OR graph (ST-AOG) from RGB-D videos of human interactions, and transfer the grammar to humanoids to enable real-time motion inference for human-robot interaction (HRI). Based on Gibbs sampling, our weakly supervised grammar learning automatically constructs a hierarchical representation of an interaction, with long-term joint sub-tasks of both agents and short-term atomic actions of individual agents. On a new RGB-D video dataset with rich instances of human interactions, our experiments on Baxter simulation, human evaluation, and a real Baxter test demonstrate that the model learned from limited training data successfully generates human-like behaviors in unseen scenarios and outperforms both baselines.
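To make the weakly supervised learning step more concrete, the toy sketch below illustrates collapsed Gibbs sampling for assigning latent sub-task labels to temporal segments of an interaction, using a simple Dirichlet-multinomial model over the atomic-action pairs observed in each segment. This is only an illustration of the sampling pattern, not the paper's actual ST-AOG model; the data format, function name, and hyper-parameters are all hypothetical.

import numpy as np

def gibbs_subtask_labels(segments, n_subtasks, n_action_pairs,
                         alpha=1.0, beta=0.5, n_iters=100, seed=0):
    # segments: list of 1-D integer arrays; each array holds indices of the
    # (agent-1 action, agent-2 action) pairs observed in one temporal segment.
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_subtasks, size=len(segments))   # latent sub-task per segment
    subtask_counts = np.zeros(n_subtasks)                    # segments assigned to each sub-task
    pair_counts = np.zeros((n_subtasks, n_action_pairs))     # action-pair counts per sub-task
    for seg, z in zip(segments, labels):
        subtask_counts[z] += 1
        np.add.at(pair_counts[z], seg, 1)

    for _ in range(n_iters):
        for i, seg in enumerate(segments):
            z = labels[i]
            # Remove segment i from the sufficient statistics (collapsed Gibbs step).
            subtask_counts[z] -= 1
            np.add.at(pair_counts[z], seg, -1)
            # Approximate per-token predictive for each candidate sub-task
            # (ignores within-segment count updates; acceptable for a toy example).
            log_p = np.log(subtask_counts + alpha)
            for k in range(n_subtasks):
                token_probs = (pair_counts[k] + beta) / (pair_counts[k].sum()
                                                         + beta * n_action_pairs)
                log_p[k] += np.log(token_probs[seg]).sum()
            p = np.exp(log_p - log_p.max())
            z = int(rng.choice(n_subtasks, p=p / p.sum()))
            labels[i] = z
            # Add segment i back under its newly sampled label.
            subtask_counts[z] += 1
            np.add.at(pair_counts[z], seg, 1)
    return labels

Each sampled label here plays the role of a latent sub-task node; in the actual ST-AOG, such assignments would additionally be constrained by the hierarchical and temporal structure of the grammar.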
Robots taught to work alongside humans by giving high fives, April 27, 2017.
Tianmin Shu, Xiaofeng Gao, Michael S. Ryoo and Song-Chun Zhu. Learning Social Affordance Grammar from Videos: Transferring Human Interactions to Human-Robot Interactions. IEEE International Conference on Robotics and Automation (ICRA), 2017. [PDF] [slides]
@inproceedings{ShuICRA17,
  title = {Learning Social Affordance Grammar from Videos: Transferring Human Interactions to Human-Robot Interactions},
  author = {Tianmin Shu and Xiaofeng Gao and Michael S. Ryoo and Song-Chun Zhu},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2017}
}
Skeleton + Annotation (118.9 MB)
The dataset is available for free to researchers from academic institutions (e.g., universities, government research labs, etc.) for non-commercial purposes.
We greatly appreciate emails reporting bugs or offering suggestions.
Please cite this paper if you use the dataset:
Tianmin Shu, Xiaofeng Gao, Michael S. Ryoo and Song-Chun Zhu. Learning Social Affordance Grammar from Videos: Transferring Human Interactions to Human-Robot Interactions. IEEE International Conference on Robotics and Automation (ICRA), 2017.