Learning Constrained Generalizable Policies by Demonstration

Leopoldo Armesto, João Moura, Vladimir Ivan, Antonio Salas, Sethu Vijayakumar

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract / Description of output

Many practical tasks in robotic systems, such as cleaning windows, writing or grasping, are inherently constrained. Learning policies subject to constraints is a challenging problem. We propose a locally weighted constrained projection learning method (LWCPL) that first estimates the constraint and then exploits this estimate across multiple observations of the constrained motion to learn an unconstrained policy. The generalization is achieved by projecting the unconstrained policy onto a new, previously unseen, constraint. We do not require any prior knowledge about the task or the policy, so we can use generic regressors to model the task and the policy. However, any prior beliefs about the structure of the motion can be expressed by choosing task-specific regressors. In particular, we can use robot kinematics and motion priors to improve the accuracy. Our evaluation results show that LWCPL outperform the state of the art method in accuracy of learning the constraints as well as the unconstrained policy, even in noisy conditions. We have validated our method by learning a wiping task from human demonstration on flat surfaces and reproducing it on an unknown curved surface using a force/torque based controller to achieve tool alignment. We show that, despite of the differences between the training and validation scenarios, we learn a policy that still provides the desired wiping motion.
Original languageEnglish
Title of host publicationProceedings of Robotics: Science and System XIII
Number of pages9
DOIs
Publication statusPublished - 16 Jul 2017
EventRobotics: Science and System XIII - Cambridge, United States
Duration: 12 Jul 201716 Jul 2017
http://rss2017.lids.mit.edu/

Conference

ConferenceRobotics: Science and System XIII
Abbreviated titleRSS 2017
Country/TerritoryUnited States
CityCambridge
Period12/07/1716/07/17
Internet address

Fingerprint

Dive into the research topics of 'Learning Constrained Generalizable Policies by Demonstration'. Together they form a unique fingerprint.

Cite this