Human-Robot-Scene Interaction and Collaboration

Introduction

Intelligent robots are advancing rapidly, and embodied agents are increasingly expected to work and live alongside humans in households, factories, hospitals, and schools. To operate safely, socially, and intelligently, these agents must interact effectively with humans and adapt to changing environments. Such interactions can also transform human behavior and even reshape the environment, for example through adjustments in human motion during robot-assisted handovers or the redesign of objects for improved robotic grasping. Beyond established research in human-human and human-scene interaction, vast opportunities remain in exploring human-robot-scene collaboration. This workshop will explore the integration of embodied agents into dynamic human-robot-scene interactions. Topics of interest include, but are not limited to:

Transferring knowledge from human-human and human-scene interaction and collaboration to inform the development of humanoids and other embodied agents (e.g., via retargeting).
Exploring different methods for deriving visual representations that capture object properties, dynamics, and affordances relevant to human-robot collaboration.
Investigating methods for modeling and predicting human intentions to enable robots to anticipate actions and respond safely.
Integrating robots into interactive settings to foster seamless and effective teamwork.
Establishing meaningful benchmarks and metrics to measure advancements in human-robot-scene interaction and collaboration.

Papers

The accepted papers can be found at OpenReview.

Oral

Poster

Challenges

We are excited to announce the Multi-Terrain Humanoid Locomotion Challenge and the Humanoid-Object Interaction Challenge, which will be held in conjunction with the workshop. The challenges aim to foster advances in humanoid-scene interaction by providing a platform for researchers to showcase their work on embodied agents in dynamic environments. For more details, please visit the challenge websites.

Multi-Terrain Humanoid Locomotion Challenge

🏆 Awards: 🥇 First Prize ($1000) 🥈 Second Prize ($500) 🥉 Third Prize ($300)

Humanoid-Object Interaction Challenge

🏆 Awards: 🥇 First Prize ($1000) 🥈 Second Prize ($500) 🥉 Third Prize ($300)

Schedule

Time	Activity	Details
13:30 - 13:40	Welcome & Introduction	Host: Jingya Wang
13:40 - 14:10	Invited talk 1: Visual Embodied Planning	Speaker: Roozbeh Mottaghi
14:10 - 14:40	Invited talk 2: Perceiving Humans and Interactions at Affordable Cost. Abstract: Understanding human behaviour requires information not only about the humans themselves but also about the surrounding environment as a whole. Embedding such perception ability into robots at an affordable cost is important for embodied AI to help every household. In this talk, I will discuss our recent work on perceiving humans and their interactions with the environment in a cost-effective way. For dynamic human-object interactions, we propose procedural interaction generation, which allows scaling up interaction data for training interaction reconstruction models that generalize to in-the-wild images and videos captured by mobile phones. I will then discuss our method PhySIC, an efficient optimization approach that reconstructs the human, the scene, and, importantly, physically plausible contacts from a single image. I will also present Human3R, which reconstructs everyone everywhere at 15 fps with a single GPU.	Speaker: Xianghui Xie
14:40 - 15:10	Oral Presentations	PICO: Reconstructing 3D People In Contact with Objects; DialNav: Multi-turn Dialog Navigation with a Remote Guide; Integrating LMM Planners and 3D Skill Policies for Generalizable Manipulation; RHINO: Learning Real-Time Humanoid-Human-Object Interaction from Human Demonstrations
15:10 - 15:40	Coffee Break & Poster Session
15:40 - 16:10	Invited talk 3: Memory as a model of the world	Speaker: Mahi Shafiullah
16:10 - 16:40	Invited talk 4	Speaker: Hang Zhao
16:40 - 17:10	Invited talk 5: Planning and Inverse Planning with Neuro-Symbolic Concepts for Human-Robot-Scene Interaction	Speaker: Jiayuan Mao
17:10 - 17:30	Challenge Award Ceremony & Concluding Remarks	Host: Yuexin Ma