Robust learning from observation with model misspecification

Luca Viano, Yu-Ting Huang, Parameswaran Kamalaruban, Craig Innes, Subramanian Ramamoorthy, Adrian Weller

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Imitation learning (IL) is a popular paradigm for training policies in robotic systems when specifying the reward function is difficult. However, despite their success, IL algorithms impose the somewhat unrealistic requirement that expert demonstrations come from the same domain in which the new imitator policy is to be learned. We consider a practical setting in which (i) the learner is given state-only expert demonstrations from the real (deployment) environment, (ii) the imitator policy is trained in a simulation (training) environment whose transition dynamics differ slightly from the real environment, and (iii) the learner has no access to the real environment during training beyond the given batch of demonstrations. Most current IL methods, such as generative adversarial imitation learning and its state-only variants, fail to imitate the optimal expert behavior in this setting. By leveraging insights from the robust reinforcement learning (RL) literature and building on recent adversarial imitation approaches, we propose a robust IL algorithm that learns policies which transfer effectively to the real environment without fine-tuning. Furthermore, we empirically demonstrate on continuous-control benchmarks that our method outperforms the state-of-the-art state-only IL method in zero-shot transfer performance in the real environment and in robustness under different testing conditions.
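To make the setting in the abstract concrete, the following is a minimal, self-contained sketch (not the paper's algorithm) of robust, state-only adversarial imitation: a discriminator scores state-transition pairs (s, s'), the imitation reward is derived from the discriminator, and the policy is trained in a simulator whose dynamics parameter is chosen adversarially from an uncertainty set around the misspecified nominal value. The toy environment, the linear policy, and all function names are illustrative assumptions, not part of the paper.

```python
# Illustrative sketch only: robust state-only adversarial imitation on a toy
# 1-D point mass.  Every name here is hypothetical and not from the paper.
import numpy as np

rng = np.random.default_rng(0)

def step(s, a, friction):
    """Toy 1-D dynamics; `friction` is the (possibly misspecified) parameter."""
    return 0.99 * s + (1.0 - friction) * a

def rollout(theta, friction, horizon=30):
    """Collect state-transition pairs (s, s') under a linear policy a = theta * s."""
    s, pairs = 1.0, []
    for _ in range(horizon):
        a = theta * s + 0.01 * rng.standard_normal()
        s_next = step(s, a, friction)
        pairs.append((s, s_next))
        s = s_next
    return np.array(pairs)

def disc_prob(w, pairs):
    """Logistic discriminator D(s, s') = sigmoid(w . [s, s', 1])."""
    feats = np.c_[pairs, np.ones(len(pairs))]
    return 1.0 / (1.0 + np.exp(-feats @ w))

def disc_update(w, expert_pairs, agent_pairs, lr=0.1):
    """One logistic-regression step: expert pairs labeled 1, agent pairs labeled 0."""
    for pairs, y in ((expert_pairs, 1.0), (agent_pairs, 0.0)):
        feats = np.c_[pairs, np.ones(len(pairs))]
        w += lr * feats.T @ (y - disc_prob(w, pairs)) / len(pairs)
    return w

def imitation_return(theta, w, friction):
    """Policy objective: sum of log D(s, s') imitation rewards on its own rollout."""
    return np.sum(np.log(disc_prob(w, rollout(theta, friction)) + 1e-8))

# (i) State-only expert demonstrations, generated in the "real" dynamics.
REAL_FRICTION = 0.1
expert_pairs = rollout(theta=-0.8, friction=REAL_FRICTION)

# (ii)-(iii) Training uses only a misspecified simulator; the adversary picks the
# worst dynamics parameter from an uncertainty set around the nominal value.
UNCERTAINTY_SET = [0.05, 0.15, 0.25]

theta, w = 0.0, np.zeros(3)
for _ in range(200):
    worst = min(UNCERTAINTY_SET, key=lambda f: imitation_return(theta, w, f))
    agent_pairs = rollout(theta, worst)
    w = disc_update(w, expert_pairs, agent_pairs)
    # Policy improvement by finite differences under the worst-case dynamics.
    eps = 0.05
    grad = (imitation_return(theta + eps, w, worst) -
            imitation_return(theta - eps, w, worst)) / (2 * eps)
    theta += 0.01 * np.clip(grad, -5, 5)

print("learned policy gain:", round(theta, 3))
```

The worst-case search over the uncertainty set is what distinguishes this sketch from a plain state-only adversarial imitation loop: the policy is evaluated, and improved, under the dynamics in the set that currently hurt it most, which is the robust-RL ingredient the abstract refers to.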
Original language: English
Title of host publication: Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2022)
Editors: Catherine Pelachaud, Matthew E. Taylor, Piotr Faliszewski, Viviana Mascardi
Publisher: International Foundation for Autonomous Agents and Multiagent Systems
Pages: 1337-1345
Number of pages: 9
ISBN (Print): 978-1-4503-9213-6
DOIs
Publication status: Published - 9 May 2022
Event: 21st International Conference on Autonomous Agents and Multiagent Systems - Auckland, New Zealand
Duration: 9 May 2022 to 13 May 2022
https://aamas2022-conference.auckland.ac.nz/

Conference

Conference: 21st International Conference on Autonomous Agents and Multiagent Systems
Abbreviated title: AAMAS 2022
Country/Territory: New Zealand
City: Auckland
Period: 9/05/22 to 13/05/22

Keywords

  • Sim-to-real transfer
  • Imitation Learning
  • Learning from Observation
  • Robust Reinforcement Learning

