On Two Continuum Armed Bandit Problems in High Dimensions

Hemant Tyagi, Sebastian U. Stich, Bernd Gärtner

Research output: Contribution to journal › Article › peer-review

Abstract

We consider the problem of continuum armed bandits where the arms are indexed by a compact subset of R^d. For large d, it is well known that mere smoothness assumptions on the reward functions lead to regret bounds that suffer from the curse of dimensionality. A typical way to tackle this in the literature has been to make further assumptions on the structure of the reward functions. In this work we assume the reward functions to be intrinsically of low dimension k ≪ d and consider two models: (i) the reward functions depend only on an unknown subset of k coordinate variables, and (ii) a generalization of (i) where the reward functions depend on an unknown k-dimensional subspace of R^d. By placing suitable assumptions on the smoothness of the rewards, we derive randomized algorithms for both problems that achieve nearly optimal regret bounds in terms of the number of rounds n.
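
One way to write the two models down explicitly is sketched here. The notation is an assumption for illustration and does not appear in the abstract: f_t is the reward function at round t, g_t is a function of k variables, i_1, ..., i_k are the unknown active coordinates, and A is an unknown k × d matrix whose rows span the relevant subspace.

```latex
% Model (i): dependence on an unknown subset of k coordinates
f_t(x_1, \dots, x_d) = g_t(x_{i_1}, \dots, x_{i_k}),
  \qquad \{i_1, \dots, i_k\} \subset \{1, \dots, d\}

% Model (ii): dependence on an unknown k-dimensional subspace
f_t(x) = g_t(A x),
  \qquad A \in \mathbb{R}^{k \times d}, \quad k \ll d
```

In this notation, model (i) is the special case of model (ii) in which the rows of A are the standard basis vectors e_{i_1}, ..., e_{i_k}.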
Original language: English
Pages (from-to): 191-222
Number of pages: 32
Journal: Theory of Computing Systems
Volume: 58
Issue number: 1
Early online date: 12 Sep 2014
Publication status: Published - Jan 2016
