Abstract
We address the problem faced by an autonomous agent that
must achieve quick responses to a family of qualitatively related
tasks, such as a robot interacting with different types
of human participants. We work in the setting where the tasks
share a state-action space and have the same qualitative objective
but differ in the dynamics and reward process. We
adopt a transfer approach where the agent attempts to exploit
common structure in learnt policies to accelerate learning of
a new task. Our technique consists of a few key steps. First,
we use a probabilistic model to describe the regions in state
space which successful trajectories seem to prefer. Then, we
extract policy fragments from previously-learnt policies for
these regions as candidates for reuse. These fragments may
be treated as options with corresponding domains and termination
conditions extracted by unsupervised learning. Finally,
the set of reusable policies is used when learning novel tasks,
and the process repeats. The utility of this method is demonstrated
through experiments in the simulated soccer domain,
where the variability comes from the different possible behaviours
of opponent teams, and the agent needs to perform
well against novel opponents.
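The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the discretised 1-D state space, the histogram "region model", the density threshold, and all function names are hypothetical stand-ins for the paper's probabilistic model and option-extraction machinery.

```python
def fit_region_model(trajectories, n_bins=4):
    """Toy stand-in for the probabilistic region model: a normalised
    histogram over discretised 1-D states visited by successful trajectories."""
    counts = {}
    for traj in trajectories:
        for state, _action in traj:
            b = int(state * n_bins)
            counts[b] = counts.get(b, 0) + 1
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}

def extract_options(policy, region_model, n_bins=4, threshold=0.2):
    """One reusable fragment per high-density region: an initiation set
    (the region's bin) plus the learnt policy restricted to that region."""
    options = []
    for b, p in region_model.items():
        if p >= threshold:
            options.append({
                "initiation_bin": b,
                "policy": {s: a for s, a in policy.items()
                           if int(s * n_bins) == b},
            })
    return options

# Two "successful trajectories" over a 1-D state space in [0, 1),
# with soccer-flavoured actions as in the paper's domain.
trajs = [
    [(0.10, "pass"), (0.30, "dribble"), (0.80, "shoot")],
    [(0.15, "pass"), (0.35, "dribble"), (0.85, "shoot")],
]
learnt_policy = {0.10: "pass", 0.15: "pass", 0.30: "dribble",
                 0.35: "dribble", 0.80: "shoot", 0.85: "shoot"}

model = fit_region_model(trajs)
options = extract_options(learnt_policy, model)
print(len(options))  # one fragment per preferred region
```

When a novel opponent arrives, the new learner would be given these fragments as candidate options alongside primitive actions, and fragments that prove useful on the new task feed back into the library, repeating the process.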
| Original language | English |
|---|---|
| Title of host publication | Lifelong Machine Learning: Papers from the 2013 AAAI Spring Symposium |
| Publisher | AAAI Press |
| Pages | 21-26 |
| Number of pages | 6 |
| Publication status | Published - 2013 |
Projects
2 finished projects:

- TOMSY: Topology Based Motion Synthesis for Dextrous Manipulation
  Vijayakumar, S., Komura, T. & Ramamoorthy, R.
  1/04/11 → 31/03/14
  Project: Research
- Topology-based Motion Synthesis
  Komura, T., Ramamoorthy, R. & Vijayakumar, S.
  30/09/10 → 28/02/14
  Project: Research