Abstract
The least-squares policy iteration approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basis function. However, it does not allow for discontinuity, which typically arises in real-world reinforcement learning tasks. In this paper, we propose a new basis function based on geodesic Gaussian kernels, which exploits the non-linear manifold structure induced by the Markov decision process. The usefulness of the proposed method is successfully demonstrated in simulated robot arm control and Khepera robot navigation.
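The core idea is that each basis function is a Gaussian bump centred at a state, but with the Euclidean distance replaced by the shortest-path (geodesic) distance on a graph over the state space, so the kernel does not smooth across walls or other discontinuities. The following is a minimal sketch of that idea, not the authors' implementation; the helper names (`shortest_path_lengths`, `geodesic_gaussian_features`), the toy corridor graph, and the choice of unit edge costs are illustrative assumptions.

```python
import heapq
import numpy as np


def shortest_path_lengths(adjacency, source):
    """Dijkstra shortest-path distances from `source` on a weighted state graph.

    `adjacency` maps each state to a list of (neighbour, edge_cost) pairs,
    e.g. derived from the MDP's transition structure.
    """
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, s = heapq.heappop(queue)
        if d > dist.get(s, float("inf")):
            continue  # stale queue entry
        for t, cost in adjacency.get(s, []):
            nd = d + cost
            if nd < dist.get(t, float("inf")):
                dist[t] = nd
                heapq.heappush(queue, (nd, t))
    return dist


def geodesic_gaussian_features(adjacency, states, centers, sigma):
    """Basis functions phi_j(s) = exp(-SP(s, c_j)^2 / (2 sigma^2)),
    where SP is the geodesic (shortest-path) distance on the state graph.

    Returns an array of shape (len(states), len(centers)) that could serve
    as the feature matrix for a least-squares policy-iteration style
    value-function approximator.
    """
    states = list(states)
    phi = np.zeros((len(states), len(centers)))
    for j, c in enumerate(centers):
        dist = shortest_path_lengths(adjacency, c)
        for i, s in enumerate(states):
            sp = dist.get(s, float("inf"))  # unreachable states get weight 0
            phi[i, j] = np.exp(-sp ** 2 / (2.0 * sigma ** 2))
    return phi


# Toy example: a 1-D corridor with a wall between states 2 and 3.
# The two sides are close in Euclidean terms but far apart on the graph,
# so the geodesic kernel respects the discontinuity instead of smoothing over it.
adjacency = {
    0: [(1, 1.0)], 1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0)],
    3: [(4, 1.0)], 4: [(3, 1.0), (5, 1.0)], 5: [(4, 1.0)],
}
phi = geodesic_gaussian_features(adjacency, states=range(6), centers=[0, 5], sigma=1.0)
print(phi.round(3))
```

In this sketch an ordinary Gaussian kernel on state indices would assign non-negligible weight across the wall, whereas the geodesic version gives exactly zero weight to states on the other side, which is the behaviour the abstract refers to as handling discontinuity.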
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 287-304 |
| Number of pages | 18 |
| Journal | Autonomous Robots |
| Volume | 25 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - 2008 |
Keywords
- Reinforcement learning
- Value function approximation
- Markov decision process
- Least-squares policy iteration
- Gaussian kernel