A Multitask Representation Using Reusable Local Policy Templates.

Benjamin Saul Rosman, Subramanian Ramamoorthy

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Constructing robust controllers to perform tasks in large, continually changing worlds is a difficult problem. A long-lived agent placed in such a world could be required to perform a variety of different tasks. For this to be possible, the agent needs to be able to abstract its experiences in a reusable way. This paper addresses the problem of online multitask decision making in such complex worlds, with inherent incompleteness in models of change. A fully general version of this problem is intractable, but many interesting domains are rendered manageable by the fact that all instances of tasks may be described using a finite set of qualitatively meaningful contexts. We suggest an approach to solving the multitask problem by decomposing the domain into a set of capabilities based on these local contexts. Capabilities resemble the options of hierarchical reinforcement learning, but provide robust behaviours capable of achieving some subgoal with an associated guarantee of achieving at least a particular aspiration level of performance. This enables these policies to be used within a planning framework, where they become a level of abstraction that factorises an otherwise large domain into task-independent sub-problems, with well-defined interfaces between the perception, control and planning problems. This is demonstrated in a stochastic navigation example, where an agent reaches different goals in different world instances without relearning.
Original language: English
Title of host publication: AAAI Spring Symposium: Designing Intelligent Robots
Publisher: AAAI Press
Number of pages: 6
Publication status: Published - 2012

Publication series

Name: AAAI Technical Report: Designing Intelligent Robots: Reintegrating AI
Number: SS-12-02
